
GitHub – nfriedly/node-unblocker: Web proxy for evading internet censorship, and general-purpose Node.js library for proxying and rewriting remote webpages

unblocker

Unblocker was originally a web proxy for evading internet censorship, similar to CGIproxy / PHProxy / Glype but written in node.js. It's since morphed into a general-purpose library for proxying and rewriting remote webpages.
All data is processed and relayed to the client on the fly without unnecessary buffering, making unblocker one of the fastest web proxies available.

The magic part

The script uses "pretty" URLs which, besides looking pretty, allow links with relative paths to just work without modification.

In addition to this, links that are relative to the root can be handled without modification by checking the referrer and 307 redirecting them to the proper location in the referring site. (Although the proxy does attempt to rewrite these links to avoid the redirect.)
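The two link cases can be illustrated with Node's built-in URL class. This is a rough sketch of the idea, not unblocker's actual implementation: the `PREFIX` constant, the helper names `resolveLink` and `redirectTarget`, and the `localhost:8080` address are all assumptions made for the example.

```javascript
// Sketch only: PREFIX and both helper names are hypothetical.
const PREFIX = '/proxy/';

// A relative link resolves against the proxied page's own URL, so it
// stays inside the proxy without any rewriting.
function resolveLink(link, proxiedPageUrl) {
  return new URL(link, proxiedPageUrl).href;
}

// A root-relative request (e.g. /other.html) lands outside the prefix.
// The referer reveals which remote site the page came from, which is
// enough to compute where a 307 redirect should point.
function redirectTarget(requestPath, referer) {
  if (!referer) return null;
  const refererPath = new URL(referer).pathname;
  if (!refererPath.startsWith(PREFIX)) return null; // not a proxied page
  const remotePageUrl = refererPath.slice(PREFIX.length);
  const remoteOrigin = new URL(remotePageUrl).origin;
  return PREFIX + remoteOrigin + requestPath;
}

const page = 'http://localhost:8080/proxy/http://example.com/path/to/file1.html';
console.log(resolveLink('file2.html', page));
console.log(redirectTarget('/other.html', page));
```

The first call stays under `/proxy/http://example.com/path/to/`, while the second shows how a request that escaped the prefix can be mapped back to the remote site's root.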
Cookies are proxied by adjusting their path to include the proxy's URL, and a bit of extra work is done to ensure they remain intact when switching protocols or subdomains.
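As an illustration of the path adjustment, the sketch below rewrites the Path attribute of a single Set-Cookie header value. The function name `rewriteCookiePath` and its parameters are hypothetical, and the built-in cookies middleware may well handle more cases than this.

```javascript
// Hypothetical helper, not unblocker's actual implementation.
// setCookie:    one Set-Cookie header value from the remote server
// prefix:       the proxy prefix, e.g. '/proxy/'
// remoteOrigin: the remote site's origin, e.g. 'http://example.com'
function rewriteCookiePath(setCookie, prefix, remoteOrigin) {
  const proxyBase = prefix + remoteOrigin; // '/proxy/http://example.com'
  const pathAttr = /(;\s*path=)([^;]*)/i;
  if (pathAttr.test(setCookie)) {
    // Keep the original path, but nest it under the proxied site's URL.
    return setCookie.replace(pathAttr, (match, attr, path) => attr + proxyBase + path);
  }
  // No Path attribute: scope the cookie to the proxied site's root.
  return setCookie + '; Path=' + proxyBase + '/';
}

console.log(rewriteCookiePath('sid=abc; Path=/; HttpOnly', '/proxy/', 'http://example.com'));
```

Scoping each cookie to the proxied site's path keeps cookies from one remote site from being sent to another through the same proxy host.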

Limitations

Although the proxy works well for standard login forms and even most AJAX content, OAuth login forms and anything that uses postMessage (Google, Facebook, etc.) are not likely to work out of the box. This is not an insurmountable issue, but it's not one that I expect to have fixed in the near term.
More advanced websites, such as Roblox, Discord, YouTube*, Instagram, etc. do not currently work. At the moment, there is no timeframe for when these might be supported.

  * There is an example that detects YouTube video pages and replaces them with a custom page that just streams the video.

Patches are welcome, including both general-purpose improvements to go into the main library, and site-specific fixes to go in the examples folder.

Running the website on your computer

See https://github.com/nfriedly/nodeunblocker.com

Using unblocker as a library in your software

npm install --save unblocker

Unblocker exports an express-compatible API, so using it in an express application is trivial:

var express = require('express')
var Unblocker = require('unblocker');
var app = express();
var unblocker = new Unblocker({prefix: '/proxy/'});

// this must be one of the first app.use() calls and must not be on a subdirectory to work properly
app.use(unblocker);

app.get('/', function(req, res) {
    //...
});

// the upgrade handler allows unblocker to proxy websockets
app.listen(process.env.PORT || 8080).on('upgrade', unblocker.onUpgrade);

See examples/simple/server.js for a complete example.
Usage without express is similarly easy, see examples/simple/server.js for an example.

Configuration

Unblocker supports the following configuration options, defaults are shown:

{
    prefix: '/proxy/',  // Path that the proxied URLs begin with. '/' is not recommended due to a few edge cases.
    host: null, // Host used in redirects (e.g. `example.com` or `localhost:8080`). Default behavior is to determine this from the request headers.
    requestMiddleware: [], // Array of functions that perform extra processing on client requests before they are sent to the remote server. API is detailed below.
    responseMiddleware: [], // Array of functions that perform extra processing on remote responses before they are sent back to the client. API is detailed below.
    standardMiddleware: true, // Allows you to disable all built-in middleware if you need to perform advanced customization of requests or responses.
    clientScripts: true, // Injects JavaScript to force things like WebSockets and XMLHttpRequest to go through the proxy.
    processContentTypes: [ // All built-in middleware that modifies the content of responses limits itself to these content-types.
        'text/html',
        'application/xml+xhtml',
        'application/xhtml+xml',
        'text/css'
    ],
    httpAgent: null, // override agent used to request http responses from the server. see https://nodejs.org/api/http.html#http_class_http_agent
    httpsAgent: null // override agent used to request https responses from the server. see https://nodejs.org/api/https.html#https_class_https_agent
}

Setting process.env.NODE_ENV='production' will enable more aggressive caching on the client scripts and potentially other optimizations in the future.
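As one possible use of the httpAgent and httpsAgent options, the sketch below passes keep-alive agents built from Node's standard http and https modules. The `keepAlive` and `maxSockets` values are illustrative assumptions, not recommendations from the project.

```javascript
const http = require('http');
const https = require('https');

// Illustrative values only: tune keepAlive and maxSockets for your own load.
const config = {
  prefix: '/proxy/',
  // Reuse TCP connections to remote servers instead of opening one per request.
  httpAgent: new http.Agent({ keepAlive: true, maxSockets: 50 }),
  httpsAgent: new https.Agent({ keepAlive: true, maxSockets: 50 })
};

console.log(config.httpAgent.keepAlive, config.httpsAgent.maxSockets);
```

The resulting object would be passed to `new Unblocker(config)` in the same way as the configuration shown earlier.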

Custom Middleware

Unblocker "middleware" are small functions that allow you to inspect and modify requests and responses. The majority of Unblocker's internal logic is implemented as middleware, and it's possible to write custom middleware to augment or replace the built-in middleware.
Custom middleware should be a function that accepts a single data argument and runs synchronously.
To process request and response data, create a Transform Stream to perform the processing in chunks and pipe through this stream. (Example below.)
To respond directly to a request, add a function to config.requestMiddleware that handles the clientResponse (a standard http.ServerResponse when used directly, or an Express Response when used with Express). Once a response is sent, no further middleware will be executed for that request. (Example below.)

requestMiddleware

Data model:

{
    url: 'http://example.com/',
    clientRequest: {request},
    clientResponse: {response},
    headers: {
        // ...
    },
    stream: {ReadableStream of data for PUT/POST requests, empty stream for other types}
}

requestMiddleware may inspect the headers, url, etc. It can modify headers, pipe PUT/POST data through a transform stream, or respond to the request directly. If you're using express, the request and response objects will have all of the usual express goodies. For example:

function validateRequest(data) {
    // dots are escaped so that only the literal hostname matches
    if (!data.url.match(/^https?:\/\/en\.wikipedia\.org\//)) {
        data.clientResponse.status(403).send('Wikipedia only.');
    }
}

var config = {
    requestMiddleware: [
        validateRequest
    ]
}

If any piece of middleware sends a response, no further middleware is run.
After all requestMiddleware has run, the request is forwarded to the remote server with the (potentially modified) url/headers/stream/etc.
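Modifying headers follows the same pattern. The sketch below is a hypothetical request middleware that overrides one header and removes another before the request is forwarded. The function name, the header values, and the assumption that header names arrive lower-cased (as they do on Node's incoming requests) are illustrative rather than taken from the library.

```javascript
// Hypothetical middleware: the name and header values are placeholders.
function overrideUserAgent(data) {
  // Assumes header names are lower-cased, as on Node's http.IncomingMessage.
  data.headers['user-agent'] = 'my-proxy/1.0';
  delete data.headers['dnt'];
}

const config = {
  requestMiddleware: [overrideUserAgent]
};

// Exercising the middleware with a mock data object:
const mock = { url: 'http://example.com/', headers: { 'user-agent': 'curl/8.0', dnt: '1' } };
config.requestMiddleware.forEach((middleware) => middleware(mock));
console.log(mock.headers);
```

Because middleware receives a plain data object and runs synchronously, it can be unit tested with a mock like this, without starting a server.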

responseMiddleware

responseMiddleware receives the same data object as the requestMiddleware, but the headers and stream fields are replaced with those of the remote server's response, and several new fields are added for the remote request and response:
Data model:

{
    url: 'http://example.com/',
    clientRequest: {request},
    clientResponse: {response},
    remoteRequest: {request},
    remoteResponse: {response},
    contentType: 'text/html',
    headers: {
        // ...
    },
    stream: {ReadableStream of response data}
}

For modifying content, create a new stream and then pipe data.stream to it and replace data.stream with it:

var Transform = require('stream').Transform;

function injectScript(data) {
    if (data.contentType == 'text/html') {

        // https://nodejs.org/api/stream.html#stream_transform
        var myStream = new Transform({
            decodeStrings: false,
            transform: function(chunk, encoding, next) {
                // '/my/script.js' is a placeholder; substitute the script you want to inject
                chunk = chunk.toString().replace('</body>', '<script src="/my/script.js"></script></body>');
                this.push(chunk);
                next();
            }
        });

        data.stream = data.stream.pipe(myStream);
    }
}

var config = {
    responseMiddleware: [
        injectScript
    ]
}

See examples/nodeunblocker.com/app.js for another example of adding a bit of middleware. Also, see any of the built-in middleware in the lib/ folder.
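One caveat with per-chunk string replacement is that a match split across two chunks is missed. The sketch below shows one way to handle that by holding back the last few characters of each chunk. The helper names `createReplacer` and `createReplaceStream` and the `/my/script.js` path are assumptions for illustration, and the code assumes chunks arrive as strings or as complete UTF-8 sequences.

```javascript
const { Transform } = require('stream');

// Pure helper: feed() returns the text that is safe to emit so far,
// end() returns whatever is still being held back.
function createReplacer(search, replacement) {
  if (!search) throw new Error('search must be a non-empty string');
  let held = '';
  return {
    feed(text) {
      let buffer = held + text;
      let out = '';
      let index;
      while ((index = buffer.indexOf(search)) !== -1) {
        out += buffer.slice(0, index) + replacement;
        buffer = buffer.slice(index + search.length);
      }
      // A match could still begin in the last (search.length - 1) characters,
      // so keep them until the next chunk arrives.
      const keep = Math.min(buffer.length, search.length - 1);
      out += buffer.slice(0, buffer.length - keep);
      held = buffer.slice(buffer.length - keep);
      return out;
    },
    end() {
      const rest = held;
      held = '';
      return rest;
    }
  };
}

// Wraps the helper in a Transform stream suitable for data.stream.pipe().
function createReplaceStream(search, replacement) {
  const replacer = createReplacer(search, replacement);
  return new Transform({
    decodeStrings: false,
    transform(chunk, encoding, next) {
      next(null, replacer.feed(chunk.toString()));
    },
    flush(next) {
      next(null, replacer.end());
    }
  });
}

// Response middleware using the stream; '/my/script.js' is a placeholder path.
function injectScript(data) {
  if (data.contentType === 'text/html') {
    data.stream = data.stream.pipe(
      createReplaceStream('</body>', '<script src="/my/script.js"></script></body>')
    );
  }
}
```

Keeping the replacement logic in a plain object separate from the Transform wrapper makes it straightforward to test without any stream plumbing.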

Built-in Middleware

Most of the internal functionality of the proxy is also implemented as middleware:

  • host: Corrects the host header in outgoing requests
  • referer: Corrects the referer header in outgoing requests
  • cookies:
    Fixes the Path attribute of set-cookie headers to limit cookies to their “path” on the proxy (e.g. Path=/proxy/http://example.com/).
    Also injects redirects to copy cookies between protocols and subdomains on a given domain.
  • hsts: Removes Strict-Transport-Security headers because they can leak to other sites and can break the proxy.
  • hpkp: Removes Public-Key-Pinning headers because they can leak to other sites and can break the proxy.
  • csp: Removes Content-Security-Policy headers because they can leak to other sites and can break the proxy.
  • redirects: Rewrites urls in 3xx redirects to ensure they go through the proxy
  • decompress: Decompresses Content-Encoding: gzip|deflate responses and also tweaks request headers to ask for either gzip-only or no compression at all. (It will attempt to decompress deflate content, but there are some issues, so it does not advertise support for deflate.)
  • charsets: Converts the charset of responses to UTF-8 for safe string processing in node.js. Determines charset from headers or meta tags and rewrites all headers and meta tags in outgoing response.
  • urlPrefixer: Rewrites URLS of links/images/css/etc. to ensure they go through the proxy
  • metaRobots: Injects a ROBOTS: NOINDEX, NOFOLLOW meta tag to prevent search engines from crawling the entire web through the proxy.
  • contentLength: Deletes the content-length header on responses if the body was modified.
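To give a feel for how small these pieces can be, here is a sketch of a response middleware that strips the security headers named in the list above. It is written from the descriptions in this list rather than from the library's source, so the function name and the exact set of header names (including the report-only variants) are assumptions.

```javascript
// Sketch of a header-stripping response middleware; not unblocker's own code.
// Header names are assumed to be lower-cased, as Node delivers them.
const STRIPPED_HEADERS = [
  'strict-transport-security',
  'public-key-pins',
  'public-key-pins-report-only',
  'content-security-policy',
  'content-security-policy-report-only'
];

function stripSecurityHeaders(data) {
  STRIPPED_HEADERS.forEach((name) => {
    delete data.headers[name];
  });
}

// Exercising it with a mock response data object:
const mockResponse = {
  contentType: 'text/html',
  headers: {
    'content-type': 'text/html',
    'strict-transport-security': 'max-age=31536000',
    'content-security-policy': "default-src 'self'"
  }
};
stripSecurityHeaders(mockResponse);
console.log(Object.keys(mockResponse.headers));
```

Headers such as these are set by the remote site for its own origin, which is why passing them through unchanged can affect every other site served from the proxy's host.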

Setting the standardMiddleware configuration option to false disables all built-in middleware, allowing you to selectively enable, configure, and re-order the built-in middleware.
This configuration would mimic the defaults:

var Unblocker = require('unblocker');

var config = {
    prefix: '/proxy/',
    host: null,
    requestMiddleware: [],
    responseMiddleware: [],
    standardMiddleware: false,  // disables all built-in middleware
    processContentTypes: [
        'text/html',
        'application/xml+xhtml',
        'application/xhtml+xml'
    ]
}

var host = Unblocker.host(config);
var referer = Unblocker.referer(config);
var cookies = Unblocker.cookies(config);
var hsts = Unblocker.hsts(config);
var hpkp = Unblocker.hpkp(config);
var csp = Unblocker.csp(config);
var redirects = Unblocker.redirects(config);
var decompress = Unblocker.decompress(config);
var charsets = Unblocker.charsets(config);
var urlPrefixer = Unblocker.urlPrefixer(config);
var metaRobots = Unblocker.metaRobots(config);
var contentLength = Unblocker.contentLength(config);

config.requestMiddleware = [
    host,
    referer,
    decompress.handleRequest,
    cookies.handleRequest
    // custom requestMiddleware here
];

config.responseMiddleware = [
    hsts,
    hpkp,
    csp,
    redirects,
    decompress.handleResponse,
    charsets,
    urlPrefixer,
    cookies.handleResponse,
    metaRobots,
    // custom responseMiddleware here
    contentLength
];

var unblocker = new Unblocker(config);
app.use(unblocker);

// ...

// the upgrade handler allows unblocker to proxy websockets
app.listen(process.env.PORT || 8080).on('upgrade', unblocker.onUpgrade);

Debugging

Unblocker is fully instrumented with debug. Enable debugging via environment variables:

DEBUG=unblocker:* node mycoolapp.js

There is also a middleware debugger that adds extra debugging middleware before and after each existing middleware function to report on changes. It's included with the default DEBUG activation and may also be selectively enabled:

DEBUG=unblocker:middleware node mycoolapp.js

… or disabled:

DEBUG=*,-unblocker:middleware node mycoolapp.js

Troubleshooting

If you're using Nginx as a reverse proxy, you probably need to disable merge_slashes to avoid endless redirects and/or other issues:

merge_slashes off;

Todo

  • Consider adding compress middleware to compress text-like responses
  • Un-prefix urls in GET / POST data
  • Inject js to proxy postMessage data and fix origins
  • More examples
  • Even more tests

AGPL-3.0 License

This project is released under the terms of the GNU Affero General Public License version 3.
All source code is copyright Nathan Friedly.
Commercial licensing and support are also available, contact Nathan Friedly (nathan@nfriedly.com) for details.
