At 12/31/05 07:24 PM, BigBlueBalls wrote:
Ok, well, I took Newgrounds' servers into account and optimized my program so that it's softer than a flea on an elephant's back, regarding the bandwidth.
Okay, okay. Enough said, man. Consider me educated now. #;-}>
I didn't fully realise the methods your program uses to comb through and gather data, and I assumed it was loading tons and tons of individual profiles like previous data miners did. I apologise for jumping to conclusions, but I think you can understand why I did so (for one thing, I have very little programming experience myself, and most of it is FORTRAN and matlab and other engineering/math things from college, nothing to do with comp sci/software programming, really).
According to ShittyKitty, this was his method: "9,000 profiles on Wednesdays, about 90,000 profiles on Sundays, and all 1.2 million possible profiles every month."
Yep. No one had told him there was anything wrong with that, and it's not like he was pulling 1.2 million profiles every day.
But the problem, it turned out, was how RAPIDLY the program itself pulled profiles. For example, if someone was only pulling 1000 profiles or so, but doing so within 1-2 minutes, that could have negative effects on the database server (even if only for that 1-2 minutes once per week and not for 1-2 hours like it takes to do 10s of 1000s of profiles for SK's NGTL).
So anyway, we've all been through that pain in the past, and the nature of the posts I saw from you on page 1 of this topic raised some red flags.
The posts from you on page 2 have lowered those red flags of mine, so my reservations are entirely withdrawn. In fact, I'm glad someone came up with a new way of automating stat collection without impacting NG negatively with mass profile drawing. Using the 50 most recent deposits list is something that listmakers like myself, Newgrundling, ramagi, jonthomson, Dogma, and many others have been doing for 3+ years, but none of us made a program that scans that 50 most recent listing all day every day to catch everyone depositing.
I'm just grabbing one page, "Top 50 Deposits", and getting each user's exp points, name and ID. Simplest, most efficient, up-to-date and softest way of grabbing the stats. No massive data transfers, just a program that checks one static html page every few minutes and that's it. So it shouldn't even be taxing the database at all, since the html page appears to be updated by a separate program, rather than directly connected to the database.
Again, thanks for the detailed explanations on the inner workings and so forth. No worries here anymore. At least not on that subject. #;-}>
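For anyone curious what "check one static page every few minutes" might look like in practice, here's a rough sketch. It is NOT BigBlueBalls' actual program. The row markup in `ROW`, the `fetch` callable, and the 300-second interval are all assumptions made up for illustration, since nobody has posted the page's real HTML or the program's real settings here.

```python
import re
import time
from typing import Callable, Dict, Iterator, Tuple

# ASSUMPTION: hypothetical row markup; the real page's HTML is not shown in this thread.
ROW = re.compile(r'<a href="/profile/(\d+)">([^<]+)</a>\s*<td>([\d,]+)</td>')

Stats = Dict[int, Tuple[str, int]]  # user ID -> (name, exp points)


def parse_rows(html: str) -> Iterator[Tuple[int, str, int]]:
    """Yield (user_id, name, exp) for every row that matches the assumed markup."""
    for user_id, name, exp in ROW.findall(html):
        yield int(user_id), name, int(exp.replace(",", ""))


def merge_snapshot(stats: Stats, html: str) -> Stats:
    """Fold one copy of the page into the running table; newer values overwrite older ones."""
    for user_id, name, exp in parse_rows(html):
        stats[user_id] = (name, exp)
    return stats


def poll(fetch: Callable[[], str], rounds: int,
         interval_s: float = 300, sleep: Callable[[float], None] = time.sleep) -> Stats:
    """Load the single page `rounds` times, pausing `interval_s` seconds between loads."""
    stats: Stats = {}
    for i in range(rounds):
        merge_snapshot(stats, fetch())
        if i < rounds - 1:
            sleep(interval_s)  # one page load per interval, nothing else
    return stats
```

`fetch` would be whatever function downloads that one page and returns its HTML as a string. Passing it in (along with `sleep`) means the merging logic can be tried on saved copies of the page without touching the site at all.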
If they see what I'm doing as taxing the server, then I might as well stop coming here because me voting on 5 movies a day certainly is hurting their server much worse than my program.
But then you wouldn't be seeing the ads on the site, either, which keep it afloat, so... heh. #;-}>
That being said, getting the other information (B/P or voting power lists) would have required something along the lines of mining profiles, so that's why I'm not going to bother with it now after reading those threads.
That was part of my worry at the start of this topic, and yep... there's no official daily b/p related listing on NG past the top 50, so you'd have to mine profiles then, and that's not the way to go, unless you want to do manual work or work out some way to do it without mass-profile-loading within short spans of time (if you automated a program to pull one profile per 30 seconds or something, that would be fine. Leave it running all night long and when you wake up in the morning... boom, top 500 b/p list is assembled for ya).
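To put rough numbers on that: 1000 profiles in 1-2 minutes works out to somewhere around 8 to 17 requests per second, while one profile per 30 seconds is 500 profiles in a bit over 4 hours. Here's a minimal sketch of that slow-and-steady approach. The `fetch_profile` callable is hypothetical (whatever you'd use to load a single profile), and the 30-second default is just the figure from my example above, not any official limit.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple


def estimated_hours(n_profiles: int, delay_s: float) -> float:
    """Rough running time if one profile is pulled every `delay_s` seconds."""
    return n_profiles * delay_s / 3600


def throttled(ids: Iterable[int], fetch_profile: Callable[[int], str],
              delay_s: float = 30.0,
              sleep: Callable[[float], None] = time.sleep) -> Iterator[Tuple[int, str]]:
    """Yield (user_id, profile) pairs, waiting `delay_s` seconds between pulls."""
    first = True
    for user_id in ids:
        if not first:
            sleep(delay_s)  # spread the load out instead of hammering the server
        first = False
        yield user_id, fetch_profile(user_id)
```

`estimated_hours(500, 30)` comes out to about 4.2, so an overnight run covers a top 500 list with room to spare.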
At 1/1/06 05:41 PM, BigBlueBalls wrote:
At 1/1/06 01:05 PM, ramagi wrote:
Well, let's start with the obvious: I wish you would have told me you were going to do this.
Not that I mind. Yet some sort of notice might have been nice.
Well I didn't really think of warning other list makers I'm coming up with this program. I kind of wanted it to be a surprise, but I guess I should have been more considerate of people like yourself who take the time to build them manually.
I don't think she meant you should have warned ALL the other listmakers like myself or others... I think she meant that because her PARTICULAR listing that she's been doing for years now is the top 500 exp (now top 400) list and your listing is top 2000 exp and your method(s) for collecting the users on it pretty much overlap entirely with hers, and hers doesn't have any users that yours is missing in the top 400 span...
That this list makes her list obsolete... that it would have been polite for you to at least warn her, if not ask for permission (obviously she ain't the boss of you, so just letting her know would have been the thing to do) before unveiling this topic. It just would have been nice to give her a bit of a heads-up since your new listing is like a huge comet to her listing's dinosaurness, understand?
For an illustration of my point, here's how she began her 1/1 update of the top 400 exp list on her site (http://ramagi.spasmbot.com/MainMenu.html)
"Well as a surpise I got yesterday with someone doing a top 2000 exp list. Well on that note this may be my last update. Since I checked the list and saw he had gotten everyone on my list. As long it list is going to continue, I see no sense in going on with mine. I'll probably make a final desicion over the next week."
This issue, however, is entirely separate... and far more subjective... than the database issue I brought up at the end of page 1, so... don't take it as connected. I just mention it because I saw ramagi's post and your reply to her.
I'm thinking of building a more accurate list of the 2000 panel of judges
I look forward to the 2000 panel of judges listing. That will truly be a unique listing if you can pull it off... you just need to keep track of the date on which each person in the top... oh, 3000-4000 exp users... last deposited. And if a user wasn't active during the past month, they don't get into the listing even if they're in the top 2000 exp, etc.
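Here's a rough sketch of the filtering step I have in mind, assuming you've already recorded a last-deposit date for each user. The `User` record and the 30-day window are my own stand-ins for illustration ("past month" could just as well mean a calendar month), so adjust to taste.

```python
from datetime import date, timedelta
from typing import Iterable, List, NamedTuple


class User(NamedTuple):
    """Hypothetical record; the field names are made up for this sketch."""
    name: str
    exp: int
    last_deposit: date


def panel_of_judges(users: Iterable[User], today: date,
                    size: int = 2000, window_days: int = 30) -> List[User]:
    """Return the top `size` users by exp among those who deposited within the window."""
    cutoff = today - timedelta(days=window_days)
    active = [u for u in users if u.last_deposit >= cutoff]
    active.sort(key=lambda u: u.exp, reverse=True)
    return active[:size]
```

Feeding it the top 3000-4000 exp users rather than just the top 2000 leaves enough spare names to fill the slots of anyone who gets dropped for inactivity.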