Spanish:
https://rentry.co/agr95
Japanese:
https://crieit.net/posts/Bank-6249a7b59dfd9
About the site that used to be Manga Bank.
"Manga Bank" has disappeared.
It has been replaced by a new illegal site called 13dl.me. Since the new site abandoned online reading and became a download (leech) type site, I have updated this article, and the board as well, as a new milestone.
Past board
Until mangaBank disappears
https://crieit.net/boards/manga-B
Well, I had always wondered how to write JavaScript, so I wrote a program in JavaScript using node.js to see what this new illegal site looks like.
It took me a while because I wasn't familiar with Promise, but JavaScript has more documentation than Nim, so it wasn't long before the program worked.
Install node.js, then install axios and cheerio with the npm package manager.
https://github.com/axios/axios#installing
axios is a module that makes HTTPS requests. Its return value is a Promise, so the response can be processed once the request succeeds.
cheerio parses HTML text into an object that can be queried with CSS selectors.
I used it the way I would use nokogiri in Ruby, beautifulsoup4 in Python, goquery in Go, or nimquery in Nim.
These convenience selector libraries are similar in usage but differ in the details, so you cannot use them properly without reading up on each one carefully. If you are not used to them, you can get by with regular expressions instead, and in that case it makes little difference which language you use: the standard regular expression style can be written in much the same way everywhere.
Go also has Perl-style regular expression modules, though they are not part of the standard library; Go's standard regexp package uses the RE2 syntax.
RE2 is a fast, safe, thread-friendly alternative to backtracking regular expression engines like those used in PCRE, Perl, and Python. It is a C++ library.
License
BSD-3-Clause license
https://github.com/google/re2/
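As an illustration of the regular-expression route, here is a minimal sketch in Python that pulls the title and href out of anchor tags without a selector library. The sample markup is hypothetical and only stands in for a listing page; the patterns assume double-quoted attributes. Note that Python's re module is a backtracking engine, not RE2.

```python
import re

# Hypothetical sample markup; the real page structure is an assumption.
SAMPLE_HTML = """
<a href="/" title="Home">Home</a>
<a href="/manga/example-one/" title="Example One">Example One</a>
<a title="Example Two" href="/manga/example-two/">Example Two</a>
<a href="/list/popular/page/2/" title="Next">Next</a>
"""

# Match each opening <a ...> tag first, then read the attributes from the
# tag text, so that the order of the attributes does not matter.
ANCHOR_TAG = re.compile(r"<a\b[^>]*>", re.IGNORECASE)
TITLE_ATTR = re.compile(r'\btitle="([^"]*)"')
HREF_ATTR = re.compile(r'\bhref="([^"]*)"')


def extract_links(html):
    """Return (title, href) pairs for anchors that carry both attributes."""
    pairs = []
    for tag in ANCHOR_TAG.findall(html):
        title = TITLE_ATTR.search(tag)
        href = HREF_ATTR.search(tag)
        if title and href:
            pairs.append((title.group(1), href.group(1)))
    return pairs


links = extract_links(SAMPLE_HTML)
for title, href in links:
    print(title, href)
```

The same three patterns should carry over to most languages with little change, which is the point made above; a real HTML parser is still the safer choice when the markup is irregular.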
1. Base
const cheerio = require('cheerio'),
axios = require('axios');
// url = `https://13dl.me/`;
var url = `https://13dl.me/list/popular/`;
var counter = 0;
function recursive(url){
let temp_url = url;
axios.get(url)
.then((response) => {
if (response.status == 200){
let $ = cheerio.load(response.data);
$('a').each(function (i, e) {
let title = $(e).attr('title');
let link = $(e).attr('href');
if (title !== undefined){
let h = /^Home/.test(title),
po = /^Popular\s/.test(title),
p = /^Prev/.test(title),
pa = /^Page/.test(title),
n = /^Next/.test(title);
if (h || po || p || pa || n || title === ``){
//unless
} else {
counter++;
console.log(counter + `{` + title +`{` + link);
// console.log(counter,title);
// console.log(`____________________`);
}
}
if (title === `Next`){
url = link;
// recursive(url);
}
})
if (url !== temp_url){
recursive(url);
}
}
}).catch(function (e) {
console.log(e);
recursive(url);
});
}
recursive(url);
What this program does is enumerate the contents of the site.
It calls the function recursively. I wasn't sure how to write that in JavaScript, but it worked, so next let's record the listed contents in an SQLite database.
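The enumeration comes down to a loop: collect the anchors that have a title, skip the navigation titles, and follow the "Next" link until there is none. Below is a minimal offline sketch of that idea in Python. The PAGES dictionary, its paths, and the titles are hypothetical stand-ins for the HTTP responses, and the fetch parameter is an assumed hook for whatever actually downloads a page.

```python
import re
from html.parser import HTMLParser

# Hypothetical two-page listing held in memory so the sketch runs offline.
PAGES = {
    "/list/popular/": (
        '<a href="/" title="Home">Home</a>'
        '<a href="/manga/a/" title="Title A">A</a>'
        '<a href="/manga/b/" title="Title B">B</a>'
        '<a href="/list/popular/page/2/" title="Next">Next</a>'
    ),
    "/list/popular/page/2/": (
        '<a href="/list/popular/" title="Prev">Prev</a>'
        '<a href="/manga/c/" title="Title C">C</a>'
    ),
}

# Titles of navigation links that should not be counted as content.
NAVIGATION = re.compile(r"^(Home|Popular\s|Prev|Page|Next)")


class AnchorCollector(HTMLParser):
    """Collect (title, href) from every <a> tag with a non-empty title."""

    def __init__(self):
        super().__init__()
        self.anchors = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attributes = dict(attrs)
            if attributes.get("title"):
                self.anchors.append((attributes["title"], attributes.get("href")))


def enumerate_titles(start_url, fetch):
    """Follow "Next" links from start_url and return content entries in order."""
    entries = []
    url = start_url
    while url:
        collector = AnchorCollector()
        collector.feed(fetch(url))
        next_url = None
        for title, href in collector.anchors:
            if title == "Next":
                next_url = href
            elif not NAVIGATION.match(title):
                entries.append((title, href))
        url = next_url  # None on the last page, which ends the loop
    return entries


entries = enumerate_titles("/list/popular/", PAGES.__getitem__)
for number, (title, href) in enumerate(entries, start=1):
    # Same "{"-delimited layout as the base program's output.
    print("{".join([str(number), title, href]))
```

A plain loop is used here instead of recursion; the two are equivalent for this purpose, and the loop avoids deep call chains on long listings.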
Even this base program is enough to create an SQLite3 database file if you combine it with awk. Treat the output as csv with { as the delimiter, read the csv with SQLite3, and save it; the result is a database file.
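The same import can be sketched in Python with only the standard library. The sample lines below are hypothetical and follow the counter{title{link layout printed by the base program; the table layout is an assumption modeled on the one used later in this article.

```python
import csv
import io
import sqlite3

# Hypothetical output of the base program: counter{title{link per line.
LISTING = io.StringIO(
    "1{Title A{/manga/a/\n"
    "2{Title B{/manga/b/\n"
    "3{It's Title C{/manga/c/\n"
)

# ":memory:" keeps the sketch self-contained; pass a file name such as
# "manga.db" (an assumed name) to produce a database file instead.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE manga(id INTEGER, title TEXT, link TEXT)")

# quoting=csv.QUOTE_NONE treats quote characters in titles as plain text.
reader = csv.reader(LISTING, delimiter="{", quoting=csv.QUOTE_NONE)
with connection:  # commits once after all rows are inserted
    connection.executemany(
        "INSERT INTO manga(id, title, link) VALUES (?, ?, ?)",
        (row for row in reader if len(row) == 3),
    )

rows = connection.execute(
    "SELECT id, title, link FROM manga ORDER BY id"
).fetchall()
print(rows)
```

Rows that do not split into exactly three fields are skipped, which is a simplification: a title that itself contains { would be dropped rather than repaired.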
Ruby
Written in Ruby, a similar program looks like this:
require 'nokogiri'
require 'open-uri'
url = 'https://13dl.me/list/popular/'
counter = 0
threads = []
def recursive(counter,url,threads)
html = URI.open(url).read
doc = Nokogiri::HTML.parse(html)
html = nil
doc.css('a').each do |x|
xx = x.attr('title')
pa = /^Page/.match?("#{xx}")
p = /^Prev/.match?("#{xx}")
if xx && xx !='' && xx !='Home' && xx !='Popular Manga' then
unless pa | p then
n = /^Next/.match?("#{xx}")
link = ''
if n then
link = x.attr('href')
threads << Thread.new do
recursive(counter,link,threads)
end
else
counter += 1
puts "#{counter} #{xx}"
end
end
end
end
doc = nil
end
recursive(counter,url,threads)
threads.each(&:join)
python
!python -m pip install requests beautifulsoup4
import re

import requests
from bs4 import BeautifulSoup

url = 'https://13dl.me/list/popular/'
counter = 0


def scrape(url, counter):
    # Fetch one listing page and print its content titles.
    response = requests.get(url)
    soup = BeautifulSoup(response.content, 'html.parser')
    response.close()
    next_url = ""
    for x in soup.find_all("a", title=True):
        title = x['title']
        if title not in ('', "Home", "Popular Manga"):
            # Skip pagination links such as "Page 2" and "Prev".
            if not re.search("^Page", title) and not re.search("^Prev", title):
                if title != "Next":
                    counter += 1
                    print(counter, title)
                else:
                    next_url = x.get('href', '')
    # An empty next_url means this was the last page.
    return next_url, counter


while url != "":
    url, counter = scrape(url, counter)
recursive
https://rentry.co/5ibqk
while
https://rentry.co/ir4b3
Nim
while
https://rentry.co/856s2
threadpool ... Work in Progress
https://rentry.co/ze8f4
perl5
Go
https://rentry.co/75gch
&sync.WaitGroup{} ... Work in Progress
https://rentry.co/wikwa
contents amount ,page | channel , sync.Mutex
https://rentry.co/o8fp4
2. SQLite3
const cheerio = require('cheerio'),
axios = require('axios');
const sqlite3 = require('sqlite3').verbose();
const db = new sqlite3.Database('13dlme0.db',(err) => {
if(err) {
return console.error(err.message);
}
console.log('Connect to SQLite database.');
});
db.serialize(() => {
db.run(`CREATE TABLE manga(id INTEGER, title TEXT, link TEXT)`)
});
var url = `https://13dl.me/list/popular/`;
var counter = 0;
function recursive(url){
let temp_url = url;
axios.get(url)
.then((response) => {
if (response.status == 200){
let $ = cheerio.load(response.data);
$('a').each(function (i, e) {
let title = $(e).attr('title');
let link = $(e).attr('href');
if (title !== undefined){
let h = /^Home/.test(title),
po = /^Popular\s/.test(title),
p = /^Prev/.test(title),
pa = /^Page/.test(title),
n = /^Next/.test(title);
if (h || po || p || pa || n || title === ``){
//unless
} else {
counter++;
// console.log(counter + ` ** ` + title + ` ** ` + link);
db.run(`insert into manga(id,title) VALUES(?,?)`,[counter,title]);
console.log(counter,title);
console.log(`____________________`);
}
}
if (title === `Next`){
url = link;
}
})
if (temp_url != url){
recursive(url);
};
}
}).catch(function (e) {
console.log(e);
recursive(url);
});
}
recursive(url);
Note that I have not called db.close() here.
How should I call db.close()?
3. SQLite3 + db.close()
const cheerio = require('cheerio'),
axios = require('axios');
const sqlite3 = require('sqlite3').verbose();
const db = new sqlite3.Database('13dlme_test1-1.db',(err) => {
if(err) {
return console.error(err.message);
}
console.log('Connect to SQLite database.');
});
db.serialize(() => {
db.run(`CREATE TABLE manga(id INTEGER, title TEXT, link TEXT)`)
});
;
var url = `https://13dl.me/list/popular/`;
var counter = 0;
function recursive(url){
const temp_url = url;
const promise1 = axios.get(url)
const promiseData1 = promise1.then((response) => {
if (response.status == 200){
let $ = cheerio.load(response.data);
$('a').each(function (i, e) {
const title = $(e).attr('title');
const link = $(e).attr('href');
if (title !== undefined){
const h = /^Home/.test(title),
po = /^Popular\s/.test(title),
p = /^Prev/.test(title),
pa = /^Page/.test(title),
n = /^Next/.test(title);
if (h || po || p || pa || n || title === ``){
//unless
} else {
counter++;
// console.log(counter + ` ** ` + title + ` ** ` + link);
db.serialize(() => {
db.run(`insert into manga(id,title) VALUES(?,?)`,[counter,title]);
});
console.log(counter,title);
console.log(`____________________`);
}
}
if ((title === `Next`) && (link != undefined)){
url = link;
}
})
} else {
console.log(response);
}
if (temp_url !== url){
return recursive(url);
// `:keep going:`;
} else {
console.log('Close SQLite database.');
return `:stop:`;
}
}).catch(function (e) {
console.log(e);
});
Promise.all([promiseData1]).then((value) => {
if (value[0] === `:stop:`){
db.close((err) => {
if(err) {
console.error(err.message);
}
});
}
});
}
recursive(url);
4. async/await
const cheerio = require('cheerio'),
axios = require('axios');
const sqlite3 = require('sqlite3').verbose();
const db = new sqlite3.Database('13dlme_test2.db',(err) => {
if(err) {
return console.error(err.message);
}
console.log('Connect to SQLite database.');
});
db.serialize(() => {
db.run(`CREATE TABLE manga(id INTEGER, title TEXT, link TEXT)`)
});
var url = `https://13dl.me/list/popular/`;
var counter = 0;
const recursive = async (url) => {
console.log(url);
const temp_url = url;
try {
const {data} = await axios.get(url);
const $ = cheerio.load(data)
$('a').each(function (i, e) {
const title = $(e).attr('title');
const link = $(e).attr('href');
if (title !== undefined){
const h = /^Home/.test(title),
po = /^Popular\s/.test(title),
p = /^Prev/.test(title),
pa = /^Page/.test(title),
n = /^Next/.test(title);
if (h || po || p || pa || n || title === ``){
//unless
} else {
counter++;
// console.log(counter + ` ** ` + title + ` ** ` + link);
db.run(`insert into manga(id,title) VALUES(?,?)`,[counter,title]);
console.log(counter,title);
console.log(`____________________`);
}
}
if ((title === `Next`) && (link != undefined)){
url = link;
}
});
if (temp_url !== url) {
// await so that an error on a later page propagates to the caller
await recursive(url);
//`:keep going:`;
}else{
console.log('Close SQLite database.');
db.close();
//`:stop:`;
}
} catch(error) {
throw error;
}
}
recursive(url);
However, even in case 2, where the database is never closed, the data still seems to be written to the database, so the commits appear to go through.
On an old PC that writes to a hard disk, committing rows one at a time, or repeatedly opening and closing the SQLite file to write, sacrifices speed. Instead, you can create the table in memory and write the data there. If you write the result to the .db file at the end, you only have to commit once, which may be the better approach.
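The in-memory idea can be sketched as follows, using Python's standard sqlite3 module. The rows, the temporary directory, and the file name manga.db are placeholders for illustration. The sketch assumes Python 3.7 or later, where Connection.backup is available.

```python
import os
import sqlite3
import tempfile

# Hypothetical rows standing in for the scraped (id, title) pairs.
ROWS = [(1, "Title A"), (2, "Title B"), (3, "Title C")]

# Collect everything in an in-memory database first.
memory_db = sqlite3.connect(":memory:")
memory_db.execute("CREATE TABLE manga(id INTEGER PRIMARY KEY, title TEXT)")
memory_db.executemany("INSERT INTO manga(id, title) VALUES (?, ?)", ROWS)
memory_db.commit()

# Write the whole database to disk in one step at the end.
# The temporary directory and file name are placeholders for this sketch.
db_path = os.path.join(tempfile.mkdtemp(), "manga.db")
file_db = sqlite3.connect(db_path)
memory_db.backup(file_db)  # copies every page of the in-memory database
file_db.close()
memory_db.close()

# Reopen the file to confirm the rows were written.
check_db = sqlite3.connect(db_path)
saved_count = check_db.execute("SELECT COUNT(*) FROM manga").fetchone()[0]
check_db.close()
print(saved_count)
```

The trade-off is that nothing reaches the disk until the final step, so a crash partway through loses the whole run; for a short listing like this one that is usually acceptable.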
The code in 3 and 4 was written to close the database once all the data has been written. Because the calls are non-blocking, I would not expect that to work as written, yet the database was closed cleanly. It is odd, and there is room to dig further, but that is hard to do when nothing fails with an error.
There is nothing else special to explain.
Ruby sqlite3
require 'nokogiri'
require 'open-uri'
require 'sqlite3'
url = 'https://13dl.me/list/popular/'
counter = 0
threads = []
SQL =<<EOS
create table manga(
id INTEGER PRIMARY KEY,
title text
);
EOS
db = SQLite3::Database.open("13dlme.db")
db.execute(SQL)
def recursive(counter,url,threads,db)
html = URI.open(url).read
doc = Nokogiri::HTML.parse(html)
html = nil
doc.css('a').each do |x|
xx = x.attr('title')
pa = /^Page/.match?("#{xx}")
p = /^Prev/.match?("#{xx}")
if xx && xx !='' && xx !='Home' && xx !='Popular Manga' then
unless pa | p then
n = /^Next/.match?("#{xx}")
link = ''
if n then
link = x.attr('href')
threads << Thread.new do
recursive(counter,link,threads,db)
end
else
counter += 1
puts "#{counter} #{xx}"
# Bind values with placeholders so titles containing quotes do not break the SQL.
db.execute("insert into manga(id,title) values(?,?)", [counter, xx])
end
end
end
end
doc = nil
end
recursive(counter,url,threads,db)
threads.each(&:join)
db.close
As for running the program: if an app such as termux or userland works on your smartphone, node.js can be installed as a package even on a device that is not rooted, so the program runs there.
In termux, you can run the programs above after installing node.js and then axios, cheerio, and sqlite3 with the npm package manager. userland starts by having you select an OS, and depending on the OS the package manager may be apt or apk. There are some differences between them, but once node.js is installed, the rest is almost the same as in termux.
For iOS there is a terminal emulator app called iSh, but I haven't tested it. I don't have a Mac. I was given an old iPhone, but I have a visceral dislike of Apple as a company, so I can't bring myself to use it.
With Ruby and Python, there are various differences between operating systems even before you install a library, so it may be an advantage of node.js that such differences do not arise. Whether or not JavaScript is a good language, the program should probably be rewritten so that it runs with nothing but a browser.
If you are a copyright holder and a title you manage appears in the list, please go ahead with the takedown procedure.
Back when the site was Manga Bank, image data served through cloudflare could not be accessed from IP addresses outside Japan. That no longer seems to be the case, and the Japanese titles are now also written in the Latin alphabet. Since the titles are romanized, the site may have overseas demand in mind.