Wordcraft Community Home Page
scroogle

This topic can be found at:
https://wordcraft.infopop.cc/eve/forums/a/tpc/f/932607094/m/1651099483

September 27, 2006, 08:42
dalehileman
scroogle
Can anyone provide a concise def easily understood by the average clod (me) of scraping, as applied to Web adverts

Is scraping Google different from scraping any other site

Is "scroogle" an abbreviation of the above and if so should it be capitalized

http://scroogle.org/


Thanks all most kindly
September 27, 2006, 08:54
zmježd
I posted ~ elsewhere.


Ceci n'est pas un seing.
September 27, 2006, 09:57
dalehileman
zm: I presume you are also tsu, in which case, yes I have been there, and that's what brought me here

I had hoped a WS'er might be able to elaborate

—Ceci n'est pas un seing--translation: If Cecil sees you first, he won't pass by
September 27, 2006, 14:02
zmježd
zm: I presume you are also tsu, in which case, yes I have been there, and that's what brought me here

Tsuwm here is tsuwm there. I am zmjezhd on both boards. I posted a couple of links on web scraping there.


Ceci n'est pas un seing.
September 27, 2006, 18:08
Seanahan
I can explain scraping briefly. A scraper, crawler, spider, etc. is a piece of code which downloads documents off of the internet. Typically, it will start with one website, download it, find links to other websites, travel to those, and download them, continuing on, often gathering millions of pages or more.
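
For anyone who wants to see the loop described above in concrete form, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not how any particular scraper is written: the names `crawl`, `LinkExtractor`, and `max_pages` are made up for this example, and `fetch` is a hypothetical function you supply yourself that takes a URL and returns the page's HTML (or None if the download fails).

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seeds, fetch, max_pages=100):
    """Breadth-first crawl starting from the seed URLs.

    `fetch` is a caller-supplied (hypothetical) function: URL in,
    HTML string out, or None on failure. Returns a dict mapping each
    downloaded URL to its HTML.
    """
    queue = deque(seeds)
    seen = set(seeds)          # URLs already queued, to avoid revisiting
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:       # download failed; skip this URL
            continue
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment.
            link, _ = urldefrag(urljoin(url, href))
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

Passing the download step in as `fetch` keeps the sketch runnable without touching the network; a real version would presumably wrap something like `urllib.request.urlopen` and would need to pause between requests and respect each site's rules.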

A Google scraper can work in a couple of ways. One is to do a Google search and download all of the relevant results. This can be combined with a normal spider, using the Google results as seeds for a larger search.

As a result of all of this, Google limits the number of results you can get from it in a short period of time. The limit never comes up in the course of normal use, but it severely limits scraping. Scroogle is an abbreviation of Google scraper, as the website appears to put up phony IP addresses, which allows scrapers to avoid the restrictions.

The capitalization is a more complicated issue. When I search the internet using Google, I am googling. The noun form is typically capitalized, while the verb form is not. I think the guiding principle here should be that Scroogle is the name of the service. If you capitalize Wikipedia, Amazon, and Yahoo! (pretend eBay doesn't exist), then you should capitalize Scroogle.
September 27, 2006, 19:57
Kalleh
Geez, zmj and dale, this is confusing for those of us who aren't familiar with the boards you're talking about.

I don't know about scroogle, but I can tell you that capitalizations as we know them are now changing. For example, most titles now, such as president, director of communication, etc., are no longer capitalized...at least according to our editor.
September 28, 2006, 09:23
dalehileman
Sean, thank you for that excellent rundown

k: Interesting, I shall be on the lookout for the phenom

But we're not supposed to identify other boards of this kind as it violates protocol
September 28, 2006, 16:23
Seanahan
A good rule of thumb is that we should pretend that no other boards exist. This makes it easy to keep track of what is going on, as well as preventing inter-forum wars, which are ugly.
September 28, 2006, 20:46
Kalleh
quote:
But we're not supposed to identify other boards of this kind as it violates protocol

Dale, it's fine to mention another board once in a while; don't worry about that. I assume the board you were referring to was AWAD. It's just not one of my favorites, and I hate giving them even the tiniest bit of PR.

Sean, I know you were being sarcastic (after all, we talk about OEDILF all the time), but do inter-forum wars exist? I suspect you are right, but I've not seen one before.
September 29, 2006, 09:30
dalehileman
No, really, it's true, we're not supposed to name other such sites
September 29, 2006, 11:00
BobHale
I used to post on snopes, APS and FOTA; I currently post only here and at the OEDILF. I left one other site which I won't name because of the autocratic policies of the site owner.

There. Sky didn't fall, did it?


"No man but a blockhead ever wrote except for money." Samuel Johnson.
September 29, 2006, 16:08
Seanahan
quote:
Originally posted by Kalleh:
Sean, I know you were being sarcastic (after all, we talk about OEDILF all the time), but do inter-forum wars exist? I suspect you are right, but I've not seen one before.


This is one of the rare times when I'm not being sarcastic. When topics migrate from one forum to another, bad things tend to happen. First, you tend to lose the context, and have weird comments like zmjezhd's, which doesn't make any sense. Second, you tend to have personality conflicts with people from different boards. You'd think we could all get along, but you'd be surprised. This leads into the topic of flame wars, which start on one forum and can migrate to others rather quickly, and cause difficulties. There are some other things, but I think that is enough.
September 29, 2006, 21:19
Kalleh
quote:
No, really, it's true, we're not supposed to name other such sites

Dale, I am an administrator here, and I can tell you that's just not true. We mention OEDILF all the time, as well as wordcraftjr. While I know a few of you also post on AWAD, I don't mention them because I have a personal dislike of that board, though occasionally others mention it. As Bob says, there are other boards talked about here (especially the APS) from time to time. Now if boards are continuously talked about, and linked to, in post after post, that's a different story. But it is fine to mention a board from time to time.

Now, Sean, I suppose you make a good point. As with anything else in life, moderation is the key. Sometimes, in order to make a point, I really must mention the other site. For example, if I bring up a word question from OEDILF, I will cite the author and the limerick; likewise, if I have a question about the word a day that I post on wordcraftjr, again I will mention that site.

I guess we've been lucky. We haven't ever had an inter-forum flamewar, even with the volatile AWAD forum.

September 29, 2006, 22:15
zmježd
have weird comments like zmjezhd's, which doesn't make any sense

I guess it just didn't translate well. I found it understandable, and so did, I presume, Dale.


Ceci n'est pas un seing.
September 30, 2006, 15:04
Kalleh
Yes, that's because you and dale both post on AWAD. I think what Sean meant was that it wasn't all that understandable for those who don't read or post on that other board. I do read AWAD every so often because occasionally their discussion of words is good.
October 01, 2006, 07:46
zmježd
I think what Sean meant was that it wasn't all that understandable for those who don't read or post on that other board.

Ah, but Sean didn't ask me a question.


Ceci n'est pas un seing.
October 01, 2006, 19:58
Kalleh
True.

I just checked that other forum, and for this word they did a better job than we did. Oh well!
October 02, 2006, 08:12
wordmatic
quote:
For example, most titles now, such as president, director of communication, etc., are no longer capitalized...at least according to our editor.

In Associated Press style, which a lot of American PR offices use because they spend much of their energies speaking with the news media, a title is capitalized if it precedes the person's name, and is lower case if it follows the name. Thus:
Chief Executive Officer Kenneth Lay, but
Kenneth Lay, chief executive officer.
Wordmatic
October 03, 2006, 21:15
Kalleh
That's interesting, Wordmatic. Our editor, who makes those rules for us based on what's happening, says even she has trouble with not capitalizing titles like chief executive officer.

I wonder the reason for capitalizing it the first time, and not the second time.
October 03, 2006, 23:49
arnie
We capitalise job titles when we include a name, but not otherwise. So we'd say Current Prime Minister Tony Blair is the slipperiest prime minister we've ever had.


Build a man a fire and he's warm for a day. Set a man on fire and he's warm for the rest of his life.
October 04, 2006, 11:23
BobHale
You might. I wouldn't. I think the inclusion of the "current" has moved "prime minister" in the sentence from the specific to the generic, meaning no caps. Now if you had written

The Prime Minister, Tony Blair, is the slipperiest prime minister we've ever had.

I'd agree with you.


"No man but a blockhead ever wrote except for money." Samuel Johnson.
October 04, 2006, 11:42
wordmatic
quote:
Originally posted by Kalleh:
I wonder the reason for capitalizing it the first time, and not the second time.

In the first usage, the title preceding the name becomes part of it--becomes, as someone else explained, specific to that person and therefore part of his name, part of the proper noun.

When the title follows the name, the reasoning goes, it is a description of the person's job, not the name of that person.

I have to agree that some of these style rules seem illogical. I have trouble explaining to academic department chairs that AP style lowercases names of academic departments, but uppercases names of offices and committees (but not sub-committees!). So we have "department of chemistry," but "Office of Admissions" and "Committee on Promotion and Tenure." After 22 years of fighting battles with disbelieving faculty over the whims of the stylebook, I've actually given up completely on some of these. I now capitalize titles in most instances, because it makes our internal audience happy. Only when I am actually writing something to be sent to a newspaper do I adhere strictly to those aspects of AP style which upset our faculty the most.

WM
October 04, 2006, 22:42
Kalleh
Wordmatic, our editor has trouble explaining the rationale to me, too. I don't know why it is so complicated. On the other hand, I realize that I capitalize way too much. For example, all the names here (like Arnie) I usually capitalize, but sometimes I don't (because he doesn't). That inconsistency isn't correct, I know.