scroogle
 
Member
posted
Can anyone provide a concise def easily understood by the average clod (me) of scraping, as applied to Web adverts

Is scraping Google different from scraping any other site

Is "scroogle" an abbreviation of the above and if so should it be capitalized

http://scroogle.org/


Thanks all most kindly
 
Posts: 657
Member
zmježd
posted
I posted ~ elsewhere.


Ceci n'est pas un seing.
 
Posts: 5149 | Location: R'lyeh
Member
posted
zm: I presume you are also tsu, in which case, yes I have been there, and that's what brought me here

I had hoped a WS'er might be able to elaborate

—Ceci n'est pas un seing--translation: If Cecil sees you first, he won't pass by
 
Posts: 657
Member
zmježd
posted
zm: I presume you are also tsu, in which case, yes I have been there, and that's what brought me here

Tsuwm here is tsuwm there. I am zmjezhd on both boards. I posted a couple of links on web scraping there.


Ceci n'est pas un seing.
 
Posts: 5149 | Location: R'lyeh
Member
posted
I can explain scraping briefly. A scraper, crawler, spider, etc. is a piece of code which downloads documents from the internet. Typically, it will start with one website, download it, find links to other websites, travel to those, and download them, continuing on, often gathering millions of pages or more.
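
The download-and-follow-links loop described above can be sketched roughly as follows. This is only a minimal illustration, not how any particular scraper is written: the `crawl` function, the `fetch` callable, and the `FAKE_WEB` dictionary of example pages are all made up here so that the sketch runs without touching the network.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_urls, fetch, max_pages=100):
    """Breadth-first crawl: download a page, queue its links, repeat.

    `fetch` is a hypothetical callable that takes a URL and returns the
    page's HTML as a string, or None if the page cannot be downloaded.
    """
    queue = deque(seed_urls)
    seen = set(seed_urls)
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)  # resolve relative links
            if link not in seen:  # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return pages


# A tiny in-memory "web" (assumed example data) standing in for real sites.
FAKE_WEB = {
    "http://a.example/": '<a href="/about">About</a> <a href="http://b.example/">B</a>',
    "http://a.example/about": '<a href="/">Home</a>',
    "http://b.example/": "<p>No links here.</p>",
}

pages = crawl(["http://a.example/"], FAKE_WEB.get)
```

A real spider would replace `FAKE_WEB.get` with something that actually downloads pages, and would also need politeness delays and error handling, which are left out here.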

A google scraper can work in a couple of ways. One is to do a google search and download all of the relevant results. This can be combined with a normal spider, using the google results as seeds for a larger search.

As a result of all of this, Google limits the number of results you can get from it in a short period of time. The limit never comes up in the course of normal use, but it severely restricts scraping. Scroogle is an abbreviation of Google scraper, as the website appears to present phony IP addresses, which allows scrapers to avoid the restrictions.
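
To illustrate why such a limit never bothers a normal user but stops a scraper, here is a rough sketch of a per-client sliding-window limiter. It is an assumption for illustration only and not a description of what Google actually does; the `SlidingWindowLimiter` class, the client names, and the numbers (3 requests per 10 seconds) are all invented.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allows at most `max_requests` per client in any `window_seconds` span."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._history = defaultdict(deque)  # client -> times of allowed requests

    def allow(self, client, now):
        history = self._history[client]
        # Forget requests that have aged out of the window.
        while history and now - history[0] >= self.window_seconds:
            history.popleft()
        if len(history) >= self.max_requests:
            return False  # too many recent requests: refuse this one
        history.append(now)
        return True


limiter = SlidingWindowLimiter(max_requests=3, window_seconds=10)

# A person searching once every half minute is never refused.
human = [limiter.allow("human", t) for t in (0, 30, 60)]

# A scraper firing one request per second is cut off after three.
scraper = [limiter.allow("scraper", t) for t in (0, 1, 2, 3, 4)]

# Once the window has passed, the scraper is allowed again.
later = limiter.allow("scraper", 12)
```

Because the history is kept per client, a service that makes requests appear to come from many different clients would spread the load across many separate counters, which is the kind of workaround the paragraph above describes.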

The capitalization is something of a more complicated issue. When I search the internet using Google, I am googling. The noun form is typically capitalized, while the verb form is not. I think the guiding principle here should be that Scroogle is the name of the service. If you capitalize Wikipedia, Amazon, and Yahoo! (pretend eBay doesn't exist), then you should capitalize Scroogle.
 
Posts: 886 | Location: Illinois
Member
Kalleh
posted
Geez, zmj and dale, this is confusing for those of us who aren't familiar with the boards you're talking about.

I don't know about scroogle, but I can tell you that capitalizations as we know them are now changing. For example, most titles now, such as president, director of communication, etc., are no longer capitalized...at least according to our editor.
 
Posts: 24735 | Location: Chicago, USA
Member
posted
Sean, thank you for that excellent rundown

k: Interesting, I shall be on the lookout for the phenom

But we're not supposed to identify other boards of this kind as it violates protocol
 
Posts: 657
Member
posted
A good rule of thumb is that we should pretend that no other boards exist. This makes it easy to keep track of what is going on, as well as preventing inter-forum wars, which are ugly.
 
Posts: 886 | Location: Illinois
Member
Kalleh
posted
quote:
But we're not supposed to identify other boards of this kind as it violates protocol

Dale, it's fine to mention another board once in a while; don't worry about that. I assume the board you were referring to was AWAD. It's just not one of my favorites, and I hate giving them even the tiniest bit of PR.

Sean, I know you were being sarcastic (after all, we talk about OEDILF all the time), but do inter-forum wars exist? I suspect you are right, but I've not seen one before.
 
Posts: 24735 | Location: Chicago, USA
Member
posted
No, really, it's true, we're not supposed to name other such sites
 
Posts: 657
Member
BobHale
posted
I used to post on snopes, APS and FOTA; I currently post only here and at the OEDILF. I left one other site which I won't name because of the autocratic policies of the site owner.

There. Sky didn't fall, did it?


"No man but a blockhead ever wrote except for money." Samuel Johnson.
 
Posts: 9423 | Location: England
Member
posted
quote:
Originally posted by Kalleh:
Sean, I know you were being sarcastic (after all, we talk about OEDILF all the time), but do inter-forum wars exist? I suspect you are right, but I've not seen one before.


This is one of the rare times when I'm not being sarcastic. When topics migrate from one forum to another, bad things tend to happen. First, you tend to lose the context, and have weird comments like zmjezhd's, which doesn't make any sense. Second, you tend to have personality conflicts with people from different boards. You'd think we could all get along, but you'd be surprised. This leads into the topic of flame wars, which start on one forum and can migrate to others rather quickly, and cause difficulties. There are some other things, but I think that is enough.
 
Posts: 886 | Location: Illinois
Member
Kalleh
posted
quote:
No, really, it's true, we're not supposed to name other such sites

Dale, I am an administrator here, and I can tell you that's just not true. We mention OEDILF all the time, as well as wordcraftjr. While I know a few of you also post on AWAD, I don't mention them because I have a personal dislike of that board, though occasionally others mention it. As Bob says, there are other boards talked about here (especially the APS) from time to time. Now if boards are continuously talked about, and linked to, in post after post, that's a different story. But it is fine to mention a board from time to time.

Now, Sean, I suppose you make a good point. As with anything else in life, moderation is the key. Sometimes, in order to make a point, I really must mention the other site. For example, if I bring up a word question from OEDILF, I will cite the author and the limerick; likewise, if I have a question about the word a day that I post on wordcraftjr, again I will mention that site.

I guess we've been lucky. We haven't ever had an inter-forum flamewar, even with the volatile AWAD forum.

 
Posts: 24735 | Location: Chicago, USA
Member
zmježd
posted
have weird comments like zmjezhd's, which doesn't make any sense

I guess it just didn't translate well. I found it understandable, and so did, I presume, Dale.


Ceci n'est pas un seing.
 
Posts: 5149 | Location: R'lyeh
Member
Kalleh
posted
Yes, that's because you and dale both post on AWAD. I think what Sean meant was that it wasn't all that understandable for those who don't read or post on that other board. I do read AWAD every so often because occasionally their discussion of words is good.
 
Posts: 24735 | Location: Chicago, USA
Member
zmježd
posted
I think what Sean meant was that it wasn't all that understandable for those who don't read or post on that other board.

Ah, but Sean didn't ask me a question.


Ceci n'est pas un seing.
 
Posts: 5149 | Location: R'lyeh
Member
Kalleh
posted
True.

I just checked that other forum, and for this word they did a better job than we did. Oh well!
 
Posts: 24735 | Location: Chicago, USA
Member
wordmatic
posted
quote:
For example, most titles now, such as president, director of communication, etc., are no longer capitalized...at least according to our editor.

In Associated Press style, which a lot of American PR offices use because they spend much of their energies speaking with the news media, a title is capitalized if it precedes the person's name, and is lower case if it follows the name. Thus:
Chief Executive Officer Kenneth Lay, but
Kenneth Lay, chief executive officer.
Wordmatic
 
Posts: 1390 | Location: Near Philadelphia, Pennsylvania, USA
Member
Kalleh
posted
That's interesting, Wordmatic. Our editor, who makes those rules for us based on what's happening, says even she has trouble with not capitalizing titles like chief executive officer.

I wonder the reason for capitalizing it the first time, and not the second time.
 
Posts: 24735 | Location: Chicago, USA
Member
arnie
posted
We capitalise job titles when we include a name, but not otherwise. So we'd say Current Prime Minister Tony Blair is the slipperiest prime minister we've ever had.


Build a man a fire and he's warm for a day. Set a man on fire and he's warm for the rest of his life.
 
Posts: 10940 | Location: London
Member
BobHale
posted
You might. I wouldn't. I think the inclusion of the "current" has moved "prime minister" in the sentence from the specific to the generic, meaning no caps. Now if you had written

The Prime Minister, Tony Blair, is the slipperiest prime minister we've ever had.

I'd agree with you.


"No man but a blockhead ever wrote except for money." Samuel Johnson.
 
Posts: 9423 | Location: England
Member
wordmatic
posted
quote:
Originally posted by Kalleh:
I wonder the reason for capitalizing it the first time, and not the second time.

In the first usage, the title preceding the name becomes part of it; it becomes, as someone else explained, specific to that person and therefore part of his name, part of the proper noun.

When the title follows the name, the reasoning goes, it is a description of the person's job, not the name of that person.

I have to agree that some of these style rules seem illogical. I have trouble explaining to academic department chairs that AP style lowercases names of academic departments, but uppercases names of offices and committees (but not sub-committees!). So we have "department of chemistry," but "Office of Admissions" and "Committee on Promotion and Tenure." After 22 years of fighting battles with disbelieving faculty over the whims of the stylebook, I've actually given up completely on some of these. I now capitalize titles in most instances, because it makes our internal audience happy. Only when I am actually writing something to be sent to a newspaper do I adhere strictly to those aspects of AP style which upset our faculty the most.

WM
 
Posts: 1390 | Location: Near Philadelphia, Pennsylvania, USA
Member
Kalleh
posted
Wordmatic, our editor has trouble explaining the rationale to me, too. I don't know why it is so complicated. On the other hand, I realize that I capitalize way too much. For example, all the names here (like Arnie) I usually capitalize, but sometimes I don't (because he doesn't). That inconsistency isn't correct, I know.
 
Posts: 24735 | Location: Chicago, USA


Copyright © 2002-12