ScrapeNinja

FREEMIUM
By Anthony | Updated 17 days ago | Data

Having trouble with extractor.

Rapid account: Ed 8 Mame
ed8mame
a month ago

I am trying to pass a function to the extractor key of the payload object, but every variation I have tried returns an extractor error in the response.

Here is my most recent attempt, using the same extractor function the ScrapeNinja Live Sandbox provides for scraping metadata.

The following code: "extractor": "function (input, cheerio) {\n let $ = cheerio.load(input);\n let data = { \n image: $('meta[property=\"og:image\"]').attr('content'),\n favicon: $('link[rel=\"icon\"]').attr(\"href\") ||\n $('link[rel=\"shortcut icon\"]').attr(\"href\"),\n url: $('link[rel=\"canonical\"]').attr(\"href\"),\n title: $('title').text().trim(),\n description: $('meta[name=description]').attr('content'),\n }\n \n let regex = /https?:\/\/?[\/]+/;\n // if relative url in files found - try to make it absolute\n if (data.url) {\n let m = data.url.match(regex);\n if (data.image && data.image[0] == '/') {\n \tdata.image = m[0] + data.image;\n }\n \n if (data.favicon && data.favicon[0] == '/') {\n \tdata.favicon = m[0] + data.favicon;\n }\n \n }\n \n return data;\n}"

Returns this error: target website extractor: { err: "Error: SyntaxError: Unexpected token ';'" }
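One possible cause, which this thread does not confirm: if the extractor source is typed inside an ordinary JavaScript or TypeScript string literal, the language drops the backslash from \/ before the request is ever sent, so the regex literal ends early and the extractor no longer parses. The sketch below reproduces that effect locally. The names viaPlainString, viaRaw and parsesAsFunction are illustrative only, and the [^\/]+ character class is an assumption, because the regex in the post appears to have been damaged by the page formatting.

```typescript
// Same extractor source written two ways. Only the regex line is kept for brevity.
// ASSUMPTION: the intended pattern is /https?:\/\/[^\/]+/ (not confirmed by the post).

// 1) Ordinary string literal: "\/" is read as "/", so the backslashes are lost.
const viaPlainString =
  "function (input) { const re = /https?:\/\/[^\/]+/; return re.source; }";

// 2) String.raw keeps every backslash exactly as typed.
const viaRaw =
  String.raw`function (input) { const re = /https?:\/\/[^\/]+/; return re.source; }`;

// Hypothetical helper: checks whether a source string parses as a function expression.
// new Function only compiles the text here; the extractor itself is never called.
function parsesAsFunction(src: string): boolean {
  try {
    new Function("return (" + src + ");");
    return true;
  } catch {
    return false;
  }
}

console.log(parsesAsFunction(viaPlainString)); // false: regex was cut to /https?:/
console.log(parsesAsFunction(viaRaw));         // true

// JSON.stringify escapes the backslashes itself, so the raw source survives the trip.
const roundTrip = JSON.parse(JSON.stringify({ extractor: viaRaw })).extractor;
console.log(roundTrip === viaRaw);             // true
```

If this is what is happening, the exact token named in the error message will depend on the real regex, so a mismatch with ';' does not rule the explanation out. Running a check like parsesAsFunction on the string before sending it is a cheap way to find out.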

My full code:
exports.myFunction = onRequest(async (_req: any, _res: any) => {
  const url = "https://scrapeninja.p.rapidapi.com/scrape";

  // `extract` is the extractor source string shown above;
  // its definition is not included in this post.
  const PAYLOAD = {
    "url": "https://www.ssense.com/en-us/men/product/rice-nine-ten/off-white-cut-off-hoodie/15545051",
    "method": "GET",
    "retryNum": 1,
    "geo": "us",
    "extractor": extract,
  };

  const options = {
    method: "POST",
    headers: {
      "content-type": "application/json",
      "X-RapidAPI-Key": "REMOVED",
      "X-RapidAPI-Host": "scrapeninja.p.rapidapi.com",
    },
    body: JSON.stringify(PAYLOAD),
  };

  try {
    let res = await fetch(url, options);
    let resJson = await res.json();

    // Basic error handling. Modify if necessary
    if (!resJson.info || ![200, 404].includes(resJson.info.statusCode)) {
      throw new Error(JSON.stringify(resJson));
    }

    console.log("target website extractor", resJson.extractor);
  } catch (e) {
    console.error(e);
  }
});

Am I doing something wrong?
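Since the post does not show how extract is defined, here is a minimal sketch of one way to build it without hand-escaping anything: write the extractor as a real function and send its source via toString(). The names extractMeta, buildPayload and requestBody are hypothetical, the regex is an assumed "scheme plus host" pattern, and whether ScrapeNinja accepts a named function (toString() yields "function extractMeta(...)" rather than "function (...)") is not confirmed here.

```typescript
// Write the extractor as ordinary code so the editor and compiler can check it.
// `cheerio` is typed as any because the service supplies it at run time.
function extractMeta(input: string, cheerio: any) {
  const $ = cheerio.load(input);
  const url: string | undefined = $('link[rel="canonical"]').attr("href");
  // ASSUMPTION: pattern for scheme + host; the regex in the post looks garbled.
  const origin = url ? url.match(/https?:\/\/[^\/]+/) : null;
  return {
    title: $("title").text().trim(),
    url,
    origin: origin ? origin[0] : null,
  };
}

// Hypothetical helper mirroring the PAYLOAD object from the post.
function buildPayload(targetUrl: string) {
  return {
    url: targetUrl,
    method: "GET",
    retryNum: 1,
    geo: "us",
    // toString() returns the function's source text, backslashes included.
    extractor: extractMeta.toString(),
  };
}

// JSON.stringify adds whatever escaping the JSON body needs.
const requestBody = JSON.stringify(buildPayload("https://example.com/"));
console.log(requestBody);
```

The trade-off is that the source sent is whatever the TypeScript toolchain emits, so it is worth logging requestBody once to see exactly what the service receives.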

Rapid account: Restyler
restyler Commented a month ago

Hi! RapidAPI's formatting corrupts the code, so it is hard to copy and paste it from here into a code editor. Could you please send your Node.js script to contact@scrapeninja.net so I can look into it? (Please attach the code as a .txt file.) You can also paste your code into any CodePen-like website or a GitHub gist and just send me the link.
