Web Content Extractor is the most powerful and easy-to-use data extraction software for web scraping, data mining, and extracting data from the internet. Not a single line of code is required. Web data extraction is completely automatic. Web Content Extractor offers you a friendly, wizard-driven interface that walks you through the process of building a data extraction pattern and creating crawling rules in a simple point-and-click manner. Web Content Extractor allows users to extract data from a particular web site whose pages have a similar structure (such as online stores, shopping sites, e-commerce sites, financial sites, business directories, product catalogs, search engine results, etc.). The extracted data can be exported to a variety of formats, including Microsoft Excel (CSV), Access, TXT, HTML, XML, SQL script, MySQL script, and any ODBC data source. This variety of export formats allows you to process and analyze data in your customary format.
Could reduce your data collection time by hours or days
mrliioadin
Pros
The learning curve is not that bad and there are a couple of good video tutorials online. You'll mostly get the hang of it with a little trial and error. I would encourage using a fairly simple website to practice with. One in which the data is consistently laid out in a predictable fashion will be most helpful for getting started. The trial will allow you to test whether the tool will work as expected on the website you intend to crawl.
Exporting your data is absolutely simple. And when it is crawling appropriately, it does so pretty quickly (of course dependent on your internet connection).
Cons
I have heard others complain about the customer service. I wouldn't be surprised, but I have not had cause to try to acquire support. At times the program can get hung up. It will get stuck or get into a loop where it fails to crawl through URLs successfully. It's not difficult to reset, but it does require a little babysitting. I set up an old laptop in a spare room and simply check on it every time I walk past. The built-in keyboard shortcuts mean that getting it back on track when it gets stuck takes about 30 seconds at most.
Don't crawl more than a few hundred URLs at a time. Break them up into smaller blocks. Spread out your scheduled crawls.
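If your URL list lives in a file or spreadsheet, a short script can split it into blocks before you load them into the program. This is only a sketch of the idea: the function name `chunk_urls`, the batch size of 200, and the example.com addresses are illustrative assumptions, not part of Web Content Extractor.

```python
from typing import Iterator, List


def chunk_urls(urls: List[str], batch_size: int = 200) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size URLs."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(urls), batch_size):
        yield urls[start:start + batch_size]


# Hypothetical input: 450 placeholder URLs split into blocks of 200.
urls = [f"https://example.com/item/{i}" for i in range(450)]
batches = list(chunk_urls(urls, batch_size=200))
```

Each block can then be crawled as its own scheduled job, so a hang only costs you one small batch.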
If the website you crawl is unpredictable in the way it presents its data, the program will misapply the data to the wrong fields. This isn't catastrophic in my experience. But it requires a little more work on the back end after you export to get all of the appropriate data in the appropriate fields.
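For that back-end cleanup, one approach is to scan the exported CSV and flag rows where a value does not look like it belongs in its column. The sketch below is an assumption-laden example, not something the program provides: the column names `price` and `url`, the patterns in `CHECKS`, and the sample rows are all made up for illustration.

```python
import csv
import io
import re

# Hypothetical per-column sanity checks; adjust to your own export.
CHECKS = {
    "price": re.compile(r"^\$?\d+(\.\d{2})?$"),
    "url": re.compile(r"^https?://\S+$"),
}


def find_suspect_rows(csv_text, checks=CHECKS):
    """Return (row_number, column) pairs whose value fails its check."""
    suspects = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Row 1 is the header, so data rows start at 2.
    for row_number, row in enumerate(reader, start=2):
        for column, pattern in checks.items():
            value = (row.get(column) or "").strip()
            if not pattern.match(value):
                suspects.append((row_number, column))
    return suspects


# Made-up sample export: the second data row has price and url swapped.
sample = (
    "name,price,url\n"
    "Widget,$9.99,https://example.com/widget\n"
    "Gadget,https://example.com/gadget,$4.50\n"
)
suspects = find_suspect_rows(sample)
```

The flagged rows are the ones worth fixing by hand; everything else can go straight into your analysis.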
Summary
It has revolutionized the way I collect data from the internet. Though it has its problems, the gains have outweighed the disadvantages. Test out the trial for yourself. Find the online video tutorials if you need them. If it seems like it's working for you, you probably won't regret your purchase. I certainly haven't.
The most useful of all extractors
dulouz11
Pros
Linear thinking: the program goes from one page to the next
Cons
Takes a while to learn, instructions minimal, tinkering required, had a few "can't do's"
Summary
I scraped lots of data from a large job site reliably, but it took lots of testing. Works well with table-heavy data; may be befuddled by more complex data arrangements.
Bad customer service
tinybushkingdom
Pros
Neat tool if you know how to use it.
Cons
No support at all. You need proficiency in HTML/Java/VB programming in order to use it independently.
Summary
Good but a few quirks.
bncplug
Pros
A small(ish) program that is quite good at saving web content. It can export it in a variety of ways.
Cons
No user manual and just a few examples of how to use it. You may need to add some VBScript for some extractions. Tech support is not helpful and won't answer emails. The program (now at Version 4.0) is expensive for what it does.
Summary
I originally tested this product for a specific task. Newprosoft sent me a project file which worked fine, so I bought the program. Since then I cannot get a response from them.
Very good! Worth the money.
dulouz11
Pros
Works very well for regularly formatted data. The product worked well for me. I have evaluated about ten similar programs. This was the cheapest and most robust.
Cons
I can't program between extracting columns until the scrape is over. This results in me downloading more than I need to.
Summary
I liked this program because I could do the application work myself without hiring a programmer to tweak the program whenever I needed a small change. Some programs cost in the $1000 range; others offered a strange set of abilities that I had no use for. Since using this program I have evaluated others, but none has matched it.
One of the best for sure
mich029
Pros
* Easy-to-use GUI
* Fast & reliable (can set multiple instances to run simultaneously)
* Advanced features such as being able to set the time between hits, only index data from pages including "blah", don't follow links that include "blah", etc.
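To show what those three crawl rules amount to, here is a rough sketch in code. It is not how Web Content Extractor implements them (the program is configured through its GUI); the `crawl` function, its parameter names, and the fake `pages` dictionary standing in for real HTTP fetches are all hypothetical.

```python
import time
from typing import Callable, Dict, Iterable


def crawl(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    must_contain: str,
    skip_if_url_contains: Iterable[str],
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, str]:
    """Fetch allowed URLs with a pause between hits; keep matching pages."""
    skip_tokens = list(skip_if_url_contains)
    kept: Dict[str, str] = {}
    first = True
    for url in urls:
        # Rule: don't follow links that include an excluded token.
        if any(token in url for token in skip_tokens):
            continue
        # Rule: wait between hits (no wait before the first one).
        if not first:
            sleep(delay_seconds)
        first = False
        text = fetch(url)
        # Rule: only index pages that include the keyword.
        if must_contain in text:
            kept[url] = text
    return kept


# Fake pages stand in for a real fetcher so the sketch runs offline.
pages = {
    "https://example.com/a": "blah widget",
    "https://example.com/b": "nothing here",
    "https://example.com/login": "blah login",
}
result = crawl(list(pages), pages.get, must_contain="blah",
               skip_if_url_contains=["login"], delay_seconds=0.0)
```

Here only the first page is kept: the second lacks the keyword and the third is skipped by its URL.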
Cons
* The accuracy of the visual form/GUI in picking up the correct elements could be better
* Uses a machine-gun, spray-style approach rather than a structured, follow-specific-links approach, which suits some sites but not others.
Summary
Overall I found this to be an excellent, easy-to-use product and well worth the low asking price when others out there are charging many hundreds of $$ and are really no better.