An easy-to-integrate parsing engine that detects and extracts urls from a body of text. It can parse plain text, HTML, JSON, XML and similar formats, and recognises multi-scheme urls whose hosts are IPv4 addresses, IPv6 addresses or textual domains.
Depending on the options specified during each request, the parser is able to find and detect many varied url formats such as:
HTML 5 Scheme - //www.linkedin.com
Usernames - user:email@example.com
Email - firstname.lastname@example.org
IPv4 Address - 192.168.1.1/hello.html
IPv4 Octets - 0x00.0x00.0x00.0x00
IPv4 Decimal - http://123123123123/
IPv6 Address - ftp://[::]/hello
IPv4-mapped IPv6 Address - http://[fe30:4:3:0:18.104.22.168]/
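The hexadecimal-octet and single-decimal host forms in the list above are alternative spellings of an ordinary IPv4 address that browsers accept. As a rough illustration only (this is not the parser's own code), the sketch below normalises both forms with Python's standard ipaddress module. The helper name normalise_ipv4_host is hypothetical.

```python
import ipaddress


def normalise_ipv4_host(host: str) -> str:
    """Return the dotted-quad form of an IPv4 host written either as a
    single integer or as four decimal/hex octets (hypothetical helper)."""
    parts = host.split(".")
    if len(parts) == 1:
        # Single-integer form: the whole address as one 32-bit number.
        return str(ipaddress.IPv4Address(int(parts[0], 0)))
    if len(parts) == 4:
        # int(x, 0) accepts plain decimal ("192") and hex ("0xC0") octets.
        octets = [int(p, 0) for p in parts]
        # bytes() rejects any octet outside 0..255 with ValueError.
        return str(ipaddress.IPv4Address(bytes(octets)))
    raise ValueError(f"unsupported IPv4 host form: {host!r}")
```

Values that do not fit in 32 bits, or octets above 255, raise ValueError here. A browser-oriented detector may be more lenient than this sketch.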
Note: The parser errs on the side of caution and over-detects urls for comprehensiveness, on the assumption that it is generally better to over-detect than to under-detect. It does NOT filter on Top Level Domain name, so it will return things that look like urls but are not (e.g. http://notrealurl.jpg). If your use case requires that valid results be limited to some subset of TLDs, we recommend that your application filter the Response results locally.
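A local TLD filter of the kind recommended above might look like the following minimal sketch. It assumes the detected urls are available as a list of strings that carry a scheme or a leading "//". The allow-list and the function name filter_by_tld are illustrative assumptions, not part of the API.

```python
from urllib.parse import urlsplit

# Assumption: an example allow-list; substitute the TLDs your application accepts.
ALLOWED_TLDS = {"com", "org", "net"}


def filter_by_tld(urls, allowed_tlds=ALLOWED_TLDS):
    """Keep only urls whose host ends in an allowed TLD (hypothetical helper)."""
    kept = []
    for url in urls:
        # hostname is None when the string has no "scheme://" or "//" prefix.
        host = urlsplit(url).hostname or ""
        tld = host.rsplit(".", 1)[-1].lower()
        if "." in host and tld in allowed_tlds:
            kept.append(url)
    return kept
```

With this allow-list, http://notrealurl.jpg is dropped because "jpg" is not an accepted TLD, while http://linkedin.com/hello is kept.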
Note also that, rather than complying strictly with RFC 3986 (http://www.ietf.org/rfc/rfc3986.txt), the parser bases detection on browser behaviour, optimising for urls that can be visited through the address bar of Chrome, Firefox, Internet Explorer, and Safari.
The parser will return all recognised parts of urls. For example, for the url: http://user@linkedin.com:39000/hello?boo=ff#frag
Scheme - "http"
Username - "user"
Password - null
Host - "linkedin.com"
Port - 39000
Path - "/hello"
Query - "?boo=ff"
Fragment - "#frag"
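For comparison, the same breakdown can be approximated locally with Python's standard urllib.parse module. This is only a sketch of the idea, not the parser's implementation. It uses a url consistent with the parts listed above, and it re-adds the leading "?" and "#" because urlsplit strips them.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://user@linkedin.com:39000/hello?boo=ff#frag")

breakdown = {
    "scheme": parts.scheme,
    "username": parts.username,
    "password": parts.password,  # None when no password is present
    "host": parts.hostname,
    "port": parts.port,
    "path": parts.path,
    # urlsplit omits the delimiters, so restore them to match the listing.
    "query": "?" + parts.query if parts.query else "",
    "fragment": "#" + parts.fragment if parts.fragment else "",
}

for name, value in breakdown.items():
    print(f"{name} - {value!r}")
```

Unlike the parser described here, urlsplit follows the generic URL syntax and does not emulate browser address-bar behaviour, so results can differ for unusual inputs.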
The Detection Options must be passed in to match the values found at the:
Returns a list of all available url detection options that can be passed during a url detection request. One or more values must be passed, and they are bitwise additive, so several options combine into a single value.
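To illustrate what "bitwise additive" means in practice, the sketch below combines flag values with a bitwise OR. The option names and numeric values (OPT_A, OPT_B, OPT_C) are placeholders invented for this example. The real names and values are whatever the options listing returns.

```python
from enum import IntFlag
from functools import reduce
from operator import or_


class DetectionOption(IntFlag):
    # Hypothetical options: replace with the values returned by the API.
    OPT_A = 1
    OPT_B = 2
    OPT_C = 4


def combine(*options: DetectionOption) -> int:
    """Fold one or more options into the single integer to send."""
    if not options:
        raise ValueError("at least one detection option is required")
    return int(reduce(or_, options))


combined = combine(DetectionOption.OPT_A, DetectionOption.OPT_C)
print(combined)
```

Because each placeholder occupies its own bit, the combined value can be decomposed again, for example with `[o.name for o in DetectionOption if o in DetectionOption(combined)]`.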
Post a body of text, along with any url detection options required, and receive back a list of matching urls detected in the input.