{"_id":"stream-snitch","_rev":"306818","name":"stream-snitch","description":"Event emitter for watching text streams with regex patterns","dist-tags":{"latest":"0.0.3"},"maintainers":[{"name":"dmotz","email":"motzdc@gmail.com"}],"time":{"modified":"2021-06-03T19:34:13.000Z","created":"2013-12-18T02:14:46.552Z","0.0.3":"2016-09-03T20:40:52.619Z","0.0.2":"2014-09-07T05:36:40.036Z","0.0.1":"2013-12-19T00:08:43.363Z","0.0.0":"2013-12-18T02:14:46.552Z"},"users":{"lapanoid":true,"zvr":true},"author":{"name":"Dan Motzenbecker","email":"dan@oxism.com","url":"http://oxism.com"},"repository":{"type":"git","url":"git+https://github.com/dmotz/stream-snitch.git"},"versions":{"0.0.3":{"name":"stream-snitch","homepage":"http://github.com/dmotz/stream-snitch","author":{"name":"Dan Motzenbecker","email":"dan@oxism.com","url":"http://oxism.com"},"repository":{"type":"git","url":"git+https://github.com/dmotz/stream-snitch.git"},"description":"Event emitter for watching text streams with regex patterns","scripts":{"test":"mocha 
./test"},"keywords":["stream","grep","regex","match","search"],"version":"0.0.3","devDependencies":{"mocha":"~3.0.2","should":"~11.1.0"},"gitHead":"a60809096d331c2a283e16903fb3a2be5119bb25","bugs":{"url":"https://github.com/dmotz/stream-snitch/issues"},"_id":"stream-snitch@0.0.3","_shasum":"897a78f13a2714fa844aa77be15477a896d852a9","_from":".","_npmVersion":"3.5.3","_nodeVersion":"5.5.0","_npmUser":{"name":"dmotz","email":"motzdc@gmail.com"},"maintainers":[{"name":"dmotz","email":"motzdc@gmail.com"}],"dist":{"shasum":"897a78f13a2714fa844aa77be15477a896d852a9","size":4422,"noattachment":false,"key":"/stream-snitch/-/stream-snitch-0.0.3.tgz","tarball":"http://registry.cnpm.dingdandao.com/stream-snitch/download/stream-snitch-0.0.3.tgz"},"_npmOperationalInternal":{"host":"packages-16-east.internal.npmjs.com","tmp":"tmp/stream-snitch-0.0.3.tgz_1472935250196_0.6147370284888893"},"directories":{},"publish_time":1472935252619,"_cnpm_publish_time":1472935252619,"_hasShrinkwrap":false},"0.0.2":{"name":"stream-snitch","homepage":"http://github.com/dmotz/stream-snitch","author":{"name":"Dan Motzenbecker","email":"dan@oxism.com","url":"http://oxism.com"},"repository":{"type":"git","url":"http://github.com/dmotz/stream-snitch.git"},"description":"Event emitter for watching text streams with regex patterns","scripts":{"test":"mocha 
./test"},"keywords":["stream","grep","regex","match","search"],"version":"0.0.2","engines":{"node":">=0.10.x"},"devDependencies":{"mocha":"1.x.x","should":"4.x.x"},"gitHead":"d2a7514fefa319a3269ba71121de0fc2b364bef2","bugs":{"url":"https://github.com/dmotz/stream-snitch/issues"},"_id":"stream-snitch@0.0.2","_shasum":"b22b1811431f776195afd5c4826ee3b5bc41a806","_from":".","_npmVersion":"1.4.26","_npmUser":{"name":"dmotz","email":"motzdc@gmail.com"},"maintainers":[{"name":"dmotz","email":"motzdc@gmail.com"}],"dist":{"shasum":"b22b1811431f776195afd5c4826ee3b5bc41a806","size":4399,"noattachment":false,"key":"/stream-snitch/-/stream-snitch-0.0.2.tgz","tarball":"http://registry.cnpm.dingdandao.com/stream-snitch/download/stream-snitch-0.0.2.tgz"},"directories":{},"publish_time":1410068200036,"_cnpm_publish_time":1410068200036,"_hasShrinkwrap":false},"0.0.1":{"name":"stream-snitch","homepage":"http://github.com/dmotz/stream-snitch","author":{"name":"Dan Motzenbecker","email":"dan@oxism.com","url":"http://oxism.com"},"repository":{"type":"git","url":"http://github.com/dmotz/stream-snitch.git"},"description":"Event emitter for watching text streams with regex patterns","keywords":["stream","grep","regex","match","search"],"version":"0.0.1","engines":{"node":">=0.10.x"},"readmeFilename":"README.md","bugs":{"url":"https://github.com/dmotz/stream-snitch/issues"},"_id":"stream-snitch@0.0.1","dist":{"shasum":"bd8cd82cc676bb60337676920087954fe20be1c7","size":3761,"noattachment":false,"key":"/stream-snitch/-/stream-snitch-0.0.1.tgz","tarball":"http://registry.cnpm.dingdandao.com/stream-snitch/download/stream-snitch-0.0.1.tgz"},"_from":".","_npmVersion":"1.3.17","_npmUser":{"name":"dmotz","email":"motzdc@gmail.com"},"maintainers":[{"name":"dmotz","email":"motzdc@gmail.com"}],"directories":{},"publish_time":1387411723363,"_cnpm_publish_time":1387411723363,"_hasShrinkwrap":false},"0.0.0":{"name":"stream-snitch","homepage":"http://github.com/dmotz/stream-snitch","author":{"name":"Dan 
Motzenbecker","email":"dan@oxism.com","url":"http://oxism.com"},"repository":{"type":"git","url":"http://github.com/dmotz/stream-snitch.git"},"description":"Event emitter for watching text streams with regex patterns","keywords":["stream","grep","regex","match","search"],"version":"0.0.0","engines":{"node":">=0.10.x"},"readmeFilename":"README.md","bugs":{"url":"https://github.com/dmotz/stream-snitch/issues"},"_id":"stream-snitch@0.0.0","dist":{"shasum":"6407723dd171bf9082ed0ad6071cb9ba06960efd","size":3650,"noattachment":false,"key":"/stream-snitch/-/stream-snitch-0.0.0.tgz","tarball":"http://registry.cnpm.dingdandao.com/stream-snitch/download/stream-snitch-0.0.0.tgz"},"_from":".","_npmVersion":"1.3.14","_npmUser":{"name":"dmotz","email":"motzdc@gmail.com"},"maintainers":[{"name":"dmotz","email":"motzdc@gmail.com"}],"directories":{},"publish_time":1387332886552,"_cnpm_publish_time":1387332886552,"_hasShrinkwrap":false}},"readme":"# stream-snitch\n#### Event emitter for watching text streams with regex patterns\n[Dan Motzenbecker](http://oxism.com), MIT License\n\n[@dcmotz](http://twitter.com/dcmotz)\n\n### Intro\n\nstream-snitch is a tiny Node module that allows you to match streaming data\npatterns with regular expressions. It's much like `... | grep`, but for Node\nstreams using native events and regular expression objects. 
It's also a good\nintroduction to the benefits of streams if you're unconvinced or unfamiliar with them.\n\n\n### Use Cases\n\nThe most obvious use case is scraping or crawling documents from an external source.\n\nTypically you might buffer the incoming chunks from a response into a string\nbuffer and then inspect the full response in the response's `end` callback.\n\nFor instance, if you had a function intended to download all image URLs\nembedded in a document:\n\n```javascript\nfunction scrape(url, fn, cb) {\n  http.get(url, function(res) {\n    var data = '';\n    res.on('data', function(chunk) { data += chunk });\n    res.on('end', function() {\n      var rx = /<img.+src=[\"'](.+)['\"].?>/gi, src;\n      while (src = rx.exec(data)) fn(src[1]);\n      cb();\n    });\n  });\n}\n```\n\nOf course, the response could be enormous and bloat your `data` buffer.\nWorse, the response chunks could arrive slowly while you want to perform\nhundreds of these download tasks concurrently and get the job done as quickly\nas possible. 
Waiting for the entire response to finish negates part of the\nasynchronous benefits Node's model offers and mainly ignores the fact that the\nresponse is a stream object that represents the data in steps as they occur.\n\nHere's the same task with stream-snitch:\n\n```javascript\nfunction scrape(url, fn, cb) {\n  http.get(url, function(res) {\n    var snitch = new StreamSnitch(/<img.+src=[\"'](.+)['\"].?>/gi);\n    snitch.on('match', function(match) { fn(match[1]) });\n    res.pipe(snitch);\n    res.on('end', cb)\n  });\n}\n```\n\nThe image download tasks (represented by `fn`) can occur as sources are found\nwithout having to wait for a potentially huge or slow request to finish first.\nSince you specify native regular expressions, the objects sent to `match`\nlisteners will contain capture group matches as the above demonstrates (`match[1]`).\n\nFor crawling, you could match `href` properties and recursively pipe their\nresponses through more stream-snitch instances.\n\nHere's another example (in CoffeeScript) from\n[soundscrape](https://github.com/dmotz/soundscrape) that matches data from inline JSON:\n\n```coffeescript\nscrape = (page, artist, title) ->\n  http.get \"#{ baseUrl }#{ artist }/#{ title or 'tracks?page=' + page }\", (res) ->\n    snitch = new StreamSnitch /bufferTracks\\.push\\((\\{.+?\\})\\)/g\n    snitch[if title then 'once' else 'on'] 'match', (match) ->\n      download parse match[1]\n      scrape ++page, artist, title unless ++trackCount % 10\n\n    res.pipe snitch\n```\n\n### Usage\n\n```\n$ npm install stream-snitch\n```\n\nCreate a stream-snitch instance with a search pattern, set a `match` callback,\nand pipe some data in:\n\n```javascript\nvar fs           = require('fs'),\n    StreamSnitch = require('stream-snitch'),\n    albumList    = fs.createReadStream('./recently_played_(HUGE).txt'),\n    cosmicSnitch = new StreamSnitch(/^cosmic\\sslop$/mgi);\n\ncosmicSnitch.on('match', 
console.log.bind(console));\nalbumList.pipe(cosmicSnitch);\n\n```\n\nFor the lazy, you can even specify the `match` event callback in the instantiation:\n```javascript\nvar words = new StreamSnitch(/\\s(\\w+)\\s/g, function(m) { /* ... */ });\n```\n\n### Caveats\n\nstream-snitch is simple internally and uses regular expressions for flexibility,\nrather than more efficient procedural parsing. The first consequence of this is\nthat it only supports streams of text and will decode binary buffers automatically.\n\nSince it offers support for any arbitrary regular expressions including capture\ngroups and start / end operators, chunks are internally buffered and examined and\ndiscarded only when matches are found. When given a regular expression in\nmultiline mode (`/m`), the buffer is cleared at the start of every newline.\n\nstream-snitch will periodically clear its internal buffer if it grows too large,\nwhich could occur if no matches are found over a large amount of data or you use\nan overly broad capture. 
There is the chance that legitimate match fragments could be\ndiscarded when the water mark is reached unless you specify a large enough buffer\nsize for your needs.\n\nThe default buffer size is one megabyte, but you can pass a custom size like this\nif you anticipate a very large capture size:\n\n```javascript\nnew StreamSnitch(/.../g, { bufferCap: 1024 * 1024 * 20 });\n```\n\nIf you want to reuse a stream-snitch instance after one stream ends, you can\nmanually call the `clearBuffer()` method.\n\nIt should be obvious, but remember to use the `m` (multiline) flag in your patterns\nif you're using the `$` operator for looking at endings on a line by line basis.\nIf you're legitimately looking for a pattern at the end of a document, stream-snitch\nonly offers some advantage over buffering the entire response, in that it periodically\ndiscards chunks from memory.\n\n","_attachments":{},"homepage":"http://github.com/dmotz/stream-snitch","bugs":{"url":"https://github.com/dmotz/stream-snitch/issues"}}