Bangumi crawler

A crawler for bangumi.tv. It collects bangumi data (excluding games, manga, music, etc.) and stores it in MongoDB.

Usage

Preparation

Requires Node.js >= 7.6.0

cd <path-to-project> && npm install

After all packages are installed:

cp config.default.js config.js && vi config.js

Now you can edit the config:

mongoDB: {
    user: 'user',
    password: 'password',
    host: 'localhost',
    port: '27017',
    db: 'db'
},
start: 1,
end: 241493

Set the mongoDB fields to match your MongoDB setup, and set start and end to the first and last bangumi IDs you want to crawl.
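As an illustration of how these fields fit together, the sketch below builds a standard MongoDB connection string from a config object of the shape shown above. The helper name buildMongoUri and the sample password are hypothetical, and the crawler itself may connect differently.

```javascript
// Hypothetical helper (not part of the project): turn the mongoDB section
// of config.js into a standard MongoDB connection string.
function buildMongoUri({ user, password, host, port, db }) {
    // Percent-encode credentials so characters such as "@" or ":" stay valid.
    const credentials = `${encodeURIComponent(user)}:${encodeURIComponent(password)}`;
    return `mongodb://${credentials}@${host}:${port}/${db}`;
}

// Sample config in the same shape as config.js (values are placeholders).
const config = {
    mongoDB: {
        user: 'user',
        password: 'p@ss',
        host: 'localhost',
        port: '27017',
        db: 'db'
    },
    start: 1,
    end: 241493
};

console.log(buildMongoUri(config.mongoDB));
// prints: mongodb://user:p%40ss@localhost:27017/db
```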

Start crawling

nohup node app.js &!

By default, the crawler scans from the configured start ID to the configured end ID. When it finishes, it creates a file named failedList.txt containing the IDs that failed with errors (such as network errors).

You can also supply the IDs to scan in a file, for example:
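The behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code in app.js: crawlRange, formatFailedList, and the fetchOne callback are hypothetical names, and the sketch assumes the IDs are visited one at a time.

```javascript
// Hypothetical sketch: visit every ID from start to end (inclusive) and
// collect the IDs whose fetch failed. `fetchOne` is an assumed callback
// that returns a promise and rejects on errors such as network failures.
async function crawlRange(start, end, fetchOne) {
    const failed = [];
    for (let id = start; id <= end; id++) {
        try {
            await fetchOne(id);
        } catch (err) {
            // Remember the ID so it can be retried later.
            failed.push(id);
        }
    }
    return failed;
}

// Serialize failed IDs as a comma-separated string, the form accepted
// by the -f input file.
function formatFailedList(ids) {
    return ids.join(',');
}
```

The resulting string could then be written out with fs.writeFileSync('failedList.txt', formatFailedList(failed)).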

nohup node app.js -f list.txt &!

IDs in the input file should be separated by ",". You can use failedList.txt as an input file.
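For reference, a comma-separated ID list can be parsed along these lines. The function parseIdList is a hypothetical sketch; the parsing in app.js may be stricter or more lenient about whitespace and invalid entries.

```javascript
// Hypothetical sketch: parse the contents of an ID list file such as
// failedList.txt into an array of integer IDs.
function parseIdList(text) {
    return text
        .split(',')                    // IDs are separated by ","
        .map((s) => s.trim())          // tolerate spaces and trailing newlines
        .filter((s) => s.length > 0)   // skip empty entries such as ",,"
        .map(Number)
        .filter(Number.isInteger);     // drop anything that is not an integer
}

console.log(parseIdList('1, 2,,abc, 30\n'));
// prints: [ 1, 2, 30 ]
```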

License

The MIT License

