{"_id":"himalaya-wxml","_rev":"340166","name":"himalaya-wxml","description":"HTML to JSON parser","dist-tags":{"latest":"1.1.0"},"maintainers":[{"name":"yuche","email":""}],"time":{"modified":"2021-06-20T02:28:49.000Z","created":"2018-10-30T13:02:14.554Z","1.1.0":"2018-10-30T13:02:14.554Z"},"users":{},"author":{"name":"Chris Andrejewski","email":"christopher.andrejewski@gmail.com"},"repository":{"type":"git","url":"git+https://github.com/andrejewski/himalaya.git"},"versions":{"1.1.0":{"name":"himalaya-wxml","description":"HTML to JSON parser","version":"1.1.0","author":{"name":"Chris Andrejewski","email":"christopher.andrejewski@gmail.com"},"ava":{"require":["babel-register"]},"babel":{"presets":["es2015","stage-0"],"plugins":[],"env":{"nyc":{"plugins":["istanbul"]}}},"bugs":{"url":"https://github.com/andrejewski/himalaya/issues"},"devDependencies":{"ava":"^0.25.0","babel-core":"^6.24.0","babel-plugin-istanbul":"^4.0.0","babel-polyfill":"^6.23.0","babel-preset-es2015":"^6.16.0","babel-preset-stage-0":"^6.16.0","babelify":"^8.0.0","browserify":"^16.0.0","coveralls":"^3.0.0","del":"^3.0.0","fixpack":"^2.3.1","gulp":"^3.9.1","gulp-babel":"^7.0.0","gulp-sourcemaps":"^2.1.1","nyc":"^11.0.2","pre-commit":"^1.2.2","pre-push":"^0.1.1","source-map-support":"^0.5.0","standard":"^11.0.0","vinyl-buffer":"^1.0.1","vinyl-source-stream":"^2.0.0"},"homepage":"https://github.com/andrejewski/himalaya","keywords":["ast","html","json","parser"],"license":"ISC","main":"lib/index.js","nyc":{"include":["src/*.js"],"require":["babel-register"],"sourceMap":false,"instrument":false,"reporter":["lcov","text"]},"pre-commit":["prepublish"],"repository":{"type":"git","url":"git+https://github.com/andrejewski/himalaya.git"},"scripts":{"build":"gulp build","coveralls":"npm run report && nyc report --reporter=text-lcov | coveralls","himalaya":"./bin/himalaya.js","report":"NODE_ENV=nyc nyc npm test","test":"fixpack && standard --fix && gulp --silent && ava","test-ci":"npm run 
prepublish"},"standard":{"ignore":["/docs/dist/**"]},"gitHead":"f0b870011b84da362c863dc914157f30d4a603ac","_id":"himalaya-wxml@1.1.0","_npmVersion":"6.1.0","_nodeVersion":"10.3.0","_npmUser":{"name":"yuche","email":"i@yuche.me"},"dist":{"shasum":"85d0341af1c5f53f3b021be8e4be890cc8b4d7af","size":18532,"noattachment":false,"key":"/himalaya-wxml/-/himalaya-wxml-1.1.0.tgz","tarball":"http://registry.cnpm.dingdandao.com/himalaya-wxml/download/himalaya-wxml-1.1.0.tgz"},"maintainers":[{"name":"yuche","email":""}],"directories":{},"_npmOperationalInternal":{"host":"s3://npm-registry-packages","tmp":"tmp/himalaya-wxml_1.1.0_1540904534338_0.4766605243042026"},"_hasShrinkwrap":false,"publish_time":1540904534554,"_cnpm_publish_time":1540904534554}},"readme":"# Himalaya\n\n> Parse HTML into JSON\n\n[![npm](https://img.shields.io/npm/v/himalaya.svg)](https://www.npmjs.com/package/himalaya)\n[![Build Status](https://travis-ci.org/andrejewski/himalaya.svg?branch=master)](https://travis-ci.org/andrejewski/himalaya)\n[![Coverage Status](https://coveralls.io/repos/github/andrejewski/himalaya/badge.svg?branch=master)](https://coveralls.io/github/andrejewski/himalaya?branch=master)\n[![Greenkeeper badge](https://badges.greenkeeper.io/andrejewski/himalaya.svg)](https://greenkeeper.io/)\n\n[Try online](http://andrejewski.github.io/himalaya)\n|\n[Read the specification](https://github.com/andrejewski/himalaya/blob/master/text/ast-spec-v1.md)\n\n## Usage\n\n### Node\n```bash\nnpm install himalaya\n```\n\n```js\nimport fs from 'fs'\nimport {parse} from 'himalaya'\nconst html = fs.readFileSync('/webpage.html', {encoding: 'utf8'})\nconst json = parse(html)\nconsole.log(json)\n```\n\n### Browser\nDownload [himalaya.js](https://github.com/andrejewski/himalaya/blob/master/docs/dist/himalaya.js) and put it in a `<script>` tag. 
Himalaya will be accessible from `window.himalaya`.\n\n```js\nconst html = '<div>Hello world</div>'\nconst json = window.himalaya.parse(html)\nconsole.log(json)\n```\n\nHimalaya bundles well with Browserify and Webpack.\n\n## Example Input/Output\n\n```html\n<div class='post post-featured'>\n  <p>Himalaya parsed me...</p>\n  <!-- ...and I liked it. -->\n</div>\n```\n\n```js\n[{\n  type: 'element',\n  tagName: 'div',\n  attributes: [{\n    key: 'class',\n    value: 'post post-featured'\n  }],\n  children: [{\n    type: 'element',\n    tagName: 'p',\n    attributes: [],\n    children: [{\n      type: 'text',\n      content: 'Himalaya parsed me...'\n    }]\n  }, {\n    type: 'comment',\n    content: ' ...and I liked it. '\n  }]\n}]\n```\n\n*Note:* In this example, text nodes consisting of whitespace are not shown for readability.\n\n## Features\n\n### Synchronous\nHimalaya transforms HTML into JSON; that's it. Himalaya is synchronous and does not require any complicated callbacks.\n\n### Handles Weirdness\nHimalaya handles a lot of HTML's fringe cases, like:\n- Closes unclosed tags `<p><b>...</p>`\n- Ignores extra closing tags `<span>...</b></span>`\n- Properly handles void tags like `<meta>` and `<img>`\n- Properly handles self-closing tags like `<input/>`\n- Handles `<!doctype>` and `<!-- comments -->`\n- Does not parse the contents of `<script>`, `<style>`, and HTML5 `<template>` tags\n\n### Preserves Whitespace\nHimalaya does not cut corners and returns an accurate representation of the HTML supplied. 
To remove whitespace, post-process the JSON; check out [an example script](https://gist.github.com/andrejewski/773487d4f4a46b16865405d7b74eabf9).\n\n### Line, column, and index positions\nHimalaya can include the start and end positions of nodes in the parse output.\nTo enable this, you can pass `parse` the `parseDefaults` extended with `includePositions: true`:\n\n```js\nimport { parse, parseDefaults } from 'himalaya'\nparse('<img>', { ...parseDefaults, includePositions: true })\n/* =>\n[\n  {\n    \"type\": \"element\",\n    \"tagName\": \"img\",\n    \"attributes\": [],\n    \"children\": [],\n    \"position\": {\n      \"start\": {\n        \"index\": 0,\n        \"line\": 0,\n        \"column\": 0\n      },\n      \"end\": {\n        \"index\": 5,\n        \"line\": 0,\n        \"column\": 5\n      }\n    }\n  }\n]\n*/\n```\n\n## Going back to HTML\nHimalaya provides a `stringify` method. The following example parses the HTML to JSON, then stringifies the JSON back into HTML.\n\n```js\nimport fs from 'fs'\nimport {parse, stringify} from 'himalaya'\n\nconst html = fs.readFileSync('/webpage.html', {encoding: 'utf8'})\nconst json = parse(html)\nfs.writeFileSync('/webpage.html', stringify(json))\n```\n\n## Why \"Himalaya\"?\n\n[First, my friends weren't helpful.](https://twitter.com/compooter/status/597908517132042240) Except Josh, Josh had my back.\n\nWhile I was testing the parser, I threw a download of my Twitter homepage in and got a giant JSON blob out. My code editor Sublime Text has a mini-map and looking at it sideways the data looked like a never-ending mountain range. Also, \"himalaya\" has H, M, L in it.\n","_attachments":{},"homepage":"https://github.com/andrejewski/himalaya","bugs":{"url":"https://github.com/andrejewski/himalaya/issues"},"license":"ISC"}