How to use a Proxy API with JavaScript
One of the benefits of an API for site owners is that it reduces the bandwidth used by scripts. But sometimes a popular website does not provide an API for its public data. In that case, if you still need the data, you may want to download it with a script. But the same website that does not provide an API may enforce rate limits per IP address. This leads to the use of proxies to download public data. In this article, we will walk through how to use a proxy API with JavaScript.
The data retrieved from the Proxy API will contain lists of Proxies. We will then test each proxy before using it to download the website source code. Several years ago this was one way to determine search engine rankings. You could graph rankings over time for target domains using lists of keywords. But since then Google has created more tools to help site owners understand how they rank.
The irony is that broader API coverage by websites would help cut the bandwidth used by scripts. When a script calls an API, the server returns only the requested data. This is efficient and uses far less bandwidth than downloading full web pages to retrieve small portions of data. Amazon is a good example. Sellers sometimes want to know what the competition is for their products so they can optimize their own pricing. Amazon has put a lot of work into creating a Product Advertising API, which reduces the usefulness of downloading Amazon web pages. But not every company employs an army of engineers to create and manage APIs like Amazon does.
JavaScript on the Server?
In the context of using a script to download pages of a website, we need to be using a script that runs on the server. JavaScript runs great in a browser on a web page, but what about on the server from a command line? That’s where Node.js comes to the rescue. Node.js allows you to execute JavaScript within a terminal or console. For this to work, we need to install Node and Node Package Manager (npm). Pick your operating system and follow the installation instructions.
A Five-Step Process
The following list is a summary of the process we will follow to use a Proxy API with JavaScript:
- Get an API Key
- Subscribe to the Quick Proxy API
- Spot check Proxies
- Use the Proxy API with JavaScript
- Optionally save a list of proxies to a JSON file for use later
Step 1. Get an API Key
First, we need to get an API Key. The Proxy API we will be using is hosted on the RapidAPI platform. Getting a key is a simple process that is free. Go to the RapidAPI home page and use an email address or social media account to connect.
Step 2. Subscribe to the Quick Proxy API
Once you are registered on the RapidAPI platform, the next step is to subscribe to the Quick Proxy API. You can do that by clicking the blue button on the endpoints page that says “Subscribe to Test”.
There are actually several Proxy APIs you can choose from. But this is the first one I found which returned working proxies from my first API call.
After subscribing, use the online test interface to view the outputs. This will provide the information you need for the next step.
Step 3. Spot check Proxies
Before integrating with an API, it’s a good idea to make sure the output is good. In particular, we want to make sure the API returns information we can use. To test a proxy we need an IP address and a port, and then we can use a simple script to test it. We will start with a code snippet that uses the Node.js request package. If you run the script below without request installed, you will see an error that mentions a missing module called “request”. If you see that error, run this once node and npm are installed: npm install request

Note that the request package has been deprecated since 2020, so npm may print a deprecation warning when you install it. The examples here still rely on it.
    const REQUEST = require('request'); // https://www.npmjs.com/package/request

    var options = {
      // The URL of the site you want to send a request to.
      // The site below simply prints the IP you are coming from.
      url: 'https://api.ipify.org',
      // The IP and Port of the proxy you want to send
      // your request through. Below is one which worked
      // initially that I retrieved from the API.
      proxy: 'http://51.158.68.133:8811'
    };

    REQUEST(options, function (error, response, body) {
      // A dead or unreachable proxy ends up here
      if (error) {
        return console.error('Proxy request failed:', error.message);
      }
      // If the Proxy worked we'll see the IP of the proxy returned
      console.log(body);
    });
Save that script to a text file and run it from the command line with node. When the proxy is working, you should see the proxy’s IP address printed. If you don’t get valid proxies from the first few results you retrieve, try a different API. I tested results from three different APIs before picking the Quick Proxy API.
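When you spot check more than one or two proxies, it can help to build the proxy URL from the IP and port instead of typing it by hand, and to compare the response against the proxy's IP in code. The sketch below is one way to do that. The function names `buildProxyUrl` and `proxyLooksWorking` are made up for this example, and the check assumes the proxy sends traffic out from the same IP address it listens on, which is not true for every proxy.

```javascript
const net = require('net');

// Hypothetical helper: turn an IP address and port into the proxy URL
// format used by the spot-check script (http://IP:PORT).
function buildProxyUrl(ip, port) {
  const portNumber = Number(port);
  if (net.isIP(ip) === 0) {
    throw new Error('Not a valid IP address: ' + ip);
  }
  if (!Number.isInteger(portNumber) || portNumber < 1 || portNumber > 65535) {
    throw new Error('Not a valid port: ' + port);
  }
  // IPv6 addresses need square brackets inside a URL
  const host = net.isIP(ip) === 6 ? '[' + ip + ']' : ip;
  return 'http://' + host + ':' + portNumber;
}

// Hypothetical helper: the proxy passes the spot check when the IP
// echoed back by api.ipify.org is the proxy's own IP.
// ASSUMPTION: the proxy exits from the same IP it listens on.
function proxyLooksWorking(proxyIp, body) {
  return typeof body === 'string' && body.trim() === proxyIp;
}

console.log(buildProxyUrl('51.158.68.133', 8811)); // http://51.158.68.133:8811
```

The returned string can be used as the `proxy` value in the options object of the spot-check script, and `proxyLooksWorking` can be called with the `body` from its callback.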
Step 4. Use the Proxy API with JavaScript
Once you have tested a few results from a proxy API you should have an idea of the results you will get. The next step is to integrate with the API to gather proxy lists.
Let’s start with the sample code on the Quick Proxy Endpoints page:
    var request = require("request");

    var options = {
      method: 'GET',
      url: 'https://quick-proxy1.p.rapidapi.com/getpage/https/1',
      headers: {
        'x-rapidapi-host': 'quick-proxy1.p.rapidapi.com',
        'x-rapidapi-key': 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
        useQueryString: true
      }
    };

    request(options, function (error, response, body) {
      if (error) throw new Error(error);
      console.log(body);
    });
Just replace the Xs with your RapidAPI key to make it work for you.
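If you would rather not paste the key into the script itself, one option is to read it from an environment variable. The sketch below builds the same options object as the sample code. The function name `buildApiOptions` and the variable name `RAPIDAPI_KEY` are made up for this example, not something RapidAPI requires.

```javascript
// Hypothetical helper: build the request options with the key passed
// in, so it can come from an environment variable instead of the
// source file. RAPIDAPI_KEY is a made-up variable name.
function buildApiOptions(apiKey, page) {
  if (!apiKey) {
    throw new Error('Set the RAPIDAPI_KEY environment variable first');
  }
  return {
    method: 'GET',
    // The page number is the last part of the URL
    url: 'https://quick-proxy1.p.rapidapi.com/getpage/https/' + page,
    headers: {
      'x-rapidapi-host': 'quick-proxy1.p.rapidapi.com',
      'x-rapidapi-key': apiKey,
      useQueryString: true // kept exactly as in the sample code
    }
  };
}

// Usage with the request package from the sample above:
//   var request = require("request");
//   request(buildApiOptions(process.env.RAPIDAPI_KEY, 1), function (error, response, body) {
//     if (error) throw new Error(error);
//     console.log(body);
//   });
```

On Linux or macOS you could then run the script as `RAPIDAPI_KEY=yourkey node script.js`, where `script.js` is whatever you named the file.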
Step 5. Optionally save a list of proxies to a JSON file
Once we have the list of Proxies in JSON format we’ll save it to a file to be looped through in our main script later. The full JavaScript code to capture a list of proxies is this:
    var request = require("request");
    var fs = require("fs");

    var options = {
      method: 'GET',
      url: 'https://quick-proxy1.p.rapidapi.com/getpage/https/1',
      headers: {
        'x-rapidapi-host': 'quick-proxy1.p.rapidapi.com',
        'x-rapidapi-key': '[insert your own x-rapidapi-key]',
        useQueryString: true
      }
    };

    request(options, function (error, response, body) {
      if (error) throw new Error(error);
      // Open a new file called proxies.json and write the list of proxies
      // to it in JSON format (from the API)
      fs.writeFile('proxies.json', body, function(err) {
        if (err) {
          return console.error(err);
        }
      });
    });
Conclusion
In this article, we’ve walked through the process of using a proxy API with JavaScript. The results are saved into a JSON file. The results are paginated, so you get 10 at a time. To build a larger list of proxies, change the page number for each request. The page number is at the end of the URL (it is 1 in the code above). After saving the proxies, you can use them in your language of preference to capture your data.
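As a rough sketch of the pagination idea, the code below builds the URL for each page and merges several response bodies into one list without duplicates. The function names `pageUrls` and `mergePages` are made up for this example. It assumes each response body is a JSON array, which you should confirm in the RapidAPI test interface, and the sample entries are invented placeholders rather than real API output.

```javascript
// Hypothetical helper: URLs for pages 1 through lastPage. The page
// number is the last part of the URL, as in the code above.
function pageUrls(lastPage) {
  const urls = [];
  for (let page = 1; page <= lastPage; page++) {
    urls.push('https://quick-proxy1.p.rapidapi.com/getpage/https/' + page);
  }
  return urls;
}

// Hypothetical helper: combine the response bodies of several pages
// into one list without duplicates.
// ASSUMPTION: each body is a JSON array; adjust this if the real
// response wraps the list in an object.
function mergePages(bodies) {
  const seen = new Set();
  const merged = [];
  for (const body of bodies) {
    for (const entry of JSON.parse(body)) {
      const key = JSON.stringify(entry);
      if (!seen.has(key)) {
        seen.add(key);
        merged.push(entry);
      }
    }
  }
  return merged;
}

// Made-up entries standing in for two API responses
const pageOne = '[{"ip":"192.0.2.1","port":8080},{"ip":"192.0.2.2","port":3128}]';
const pageTwo = '[{"ip":"192.0.2.2","port":3128},{"ip":"192.0.2.3","port":80}]';
console.log(mergePages([pageOne, pageTwo]).length); // 3
```

The merged list can be written to proxies.json with `JSON.stringify` in place of the single `body` saved in step 5.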
Once you have a list of proxies as long as you want, then you can follow this process to use them:
- Read the JSON file into a local variable (decoding the JSON into a local array).
- Loop through the proxies.
- For each proxy, use the spot-checking code in step 3 above to make sure it’s working.
- If the proxy is working, use a modified version of the spot-checking code. Replace the api.ipify.org URL with the URL of the page that has the data you are looking to capture.
- Put the data you are looking to capture into a local database or set of files.
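The first three steps of the list above could be sketched like this. The function names `loadProxies` and `filterWorkingProxies` are made up for this example, the code assumes proxies.json contains a JSON array, and `checkProxy` is a function you would write yourself around the spot-checking code from step 3.

```javascript
const fs = require('fs');

// Read the saved file and decode the JSON into a local array.
// ASSUMPTION: the file holds a JSON array of proxy entries.
function loadProxies(path) {
  return JSON.parse(fs.readFileSync(path, 'utf8'));
}

// Loop through the proxies one at a time and keep the ones that pass.
// checkProxy is a hypothetical function you supply: it receives one
// entry and resolves to true when the step 3 spot check succeeds.
async function filterWorkingProxies(proxies, checkProxy) {
  const working = [];
  for (const proxy of proxies) {
    try {
      if (await checkProxy(proxy)) {
        working.push(proxy);
      }
    } catch (err) {
      // A proxy that throws (timeout, connection refused) is skipped
    }
  }
  return working;
}
```

Checking the proxies one after another is slower than checking them in parallel, but it keeps the load on the test site low and makes failures easier to trace.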
Proxy FAQ
Can I use the proxies all at the same time?
- Even though you are using a proxy you still need to wait some amount of time between requests. If you don’t you may get all your proxies blocked.
- Do not overload any site with requests from many proxies at the same time. If you do, your script will look like a Distributed Denial of Service attack, which is illegal.
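One simple way to wait between requests is to run them strictly one after another with a pause in between. This is a minimal sketch: the names `sleep` and `runWithDelay` are made up for this example, `task` stands for whatever request function you write, and the right delay depends on the site you are visiting.

```javascript
// Resolve after the given number of milliseconds
function sleep(ms) {
  return new Promise(function (resolve) {
    setTimeout(resolve, ms);
  });
}

// Run one task per item, strictly one after another, pausing between
// them. task is a hypothetical async function you supply (for example
// one request through one proxy); delayMs is a value you choose.
async function runWithDelay(items, task, delayMs) {
  const results = [];
  for (let i = 0; i < items.length; i++) {
    if (i > 0) {
      await sleep(delayMs);
    }
    results.push(await task(items[i]));
  }
  return results;
}
```

For example, `runWithDelay(proxyList, fetchThroughProxy, 5000)` would leave five seconds between requests, where `proxyList` and `fetchThroughProxy` are names you would define yourself.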
How do I manage many proxies?
- Always check for failures from each proxy, and stop making requests through a proxy once it stops working. A failure means the data returned is blank or is an error message.
- Use a database to keep track of the status of each proxy.
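To illustrate the bookkeeping, here is a small in-memory sketch that stands in for the database. The class name `ProxyTracker` and the failure threshold are made up for this example. It only detects errors and blank responses, so recognizing an error page returned as normal content would need a check specific to the site you are reading.

```javascript
// In-memory stand-in for the database mentioned above.
// maxFailures is a hypothetical threshold; pick what suits your list.
class ProxyTracker {
  constructor(maxFailures) {
    this.maxFailures = maxFailures;
    this.failures = new Map(); // proxy URL -> consecutive failure count
  }

  // Treat an error or a blank body as a failure. A success resets
  // the count. Returns true when the request counted as a success.
  record(proxyUrl, error, body) {
    const failed = Boolean(error) || !body || String(body).trim() === '';
    const count = failed ? (this.failures.get(proxyUrl) || 0) + 1 : 0;
    this.failures.set(proxyUrl, count);
    return !failed;
  }

  // A proxy is retired once it reaches the failure threshold
  isUsable(proxyUrl) {
    return (this.failures.get(proxyUrl) || 0) < this.maxFailures;
  }
}

const tracker = new ProxyTracker(2);
tracker.record('http://192.0.2.1:8080', null, '');
tracker.record('http://192.0.2.1:8080', new Error('timeout'), undefined);
console.log(tracker.isUsable('http://192.0.2.1:8080')); // false
console.log(tracker.isUsable('http://192.0.2.2:8080')); // true
```

Calling `record` from the request callback with its `error` and `body` arguments, and checking `isUsable` before each request, is enough to stop sending traffic through a dead proxy.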
Can I do (insert something shady here) through a proxy?
- Using a proxy does not give you a free pass to do whatever you want. Your ISP still tracks your traffic and likely other servers along the way do too.
Does a proxy improve performance?
- Proxies will slow you down because they are an extra step. Some are faster than others depending on a host of factors like distance from you.
How can I make sure the proxies are reliable?
- If reliability is important to you, you can create your own proxies. You’ll need servers with different IP addresses, which you can get by opening shared hosting accounts with several different web hosts. Then set up a simple proxy script on each server. You can confirm each one is working using the same process outlined above for free proxies. Or you can search for “Proxy Service” and find one that you pay for (and trust!).
What else can I do with a proxy?
- If you have a reliable proxy (like a shared server you pay for) you can send your personal traffic through it. This can help you with the privacy and security of your identity online.