newline battles continue

bel
2020-01-19 20:41:30 +00:00
parent 98adb53caf
commit 573696774e
1456 changed files with 501133 additions and 6 deletions

vendor/github.com/mmcdole/gofeed/LICENSE generated vendored Normal file

@@ -0,0 +1,21 @@
The MIT License (MIT)
Copyright (c) 2016 mmcdole
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

vendor/github.com/mmcdole/gofeed/README.md generated vendored Normal file

@@ -0,0 +1,255 @@
# gofeed
[![Build Status](https://travis-ci.org/mmcdole/gofeed.svg?branch=master)](https://travis-ci.org/mmcdole/gofeed) [![Coverage Status](https://coveralls.io/repos/github/mmcdole/gofeed/badge.svg?branch=master)](https://coveralls.io/github/mmcdole/gofeed?branch=master) [![Go Report Card](https://goreportcard.com/badge/github.com/mmcdole/gofeed)](https://goreportcard.com/report/github.com/mmcdole/gofeed) [![](https://godoc.org/github.com/mmcdole/gofeed?status.svg)](http://godoc.org/github.com/mmcdole/gofeed) [![License](http://img.shields.io/:license-mit-blue.svg)](http://doge.mit-license.org)
The `gofeed` library is a robust feed parser that supports parsing both [RSS](https://en.wikipedia.org/wiki/RSS) and [Atom](https://en.wikipedia.org/wiki/Atom_(standard)) feeds. The library provides a universal `gofeed.Parser` that will parse and convert all feed types into a hybrid `gofeed.Feed` model. You also have the option of utilizing the feed specific `atom.Parser` or `rss.Parser` parsers, which generate `atom.Feed` and `rss.Feed` respectively.
## Table of Contents
- [Features](#features)
- [Overview](#overview)
- [Basic Usage](#basic-usage)
- [Advanced Usage](#advanced-usage)
- [Extensions](#extensions)
- [Invalid Feeds](#invalid-feeds)
- [Default Mappings](#default-mappings)
- [Dependencies](#dependencies)
- [License](#license)
- [Credits](#credits)
## Features
#### Supported feed types:
* RSS 0.90
* Netscape RSS 0.91
* Userland RSS 0.91
* RSS 0.92
* RSS 0.93
* RSS 0.94
* RSS 1.0
* RSS 2.0
* Atom 0.3
* Atom 1.0
#### Extension Support
The `gofeed` library provides support for parsing several popular predefined extensions into ready-made structs, including [Dublin Core](http://dublincore.org/documents/dces/) and [Apple's iTunes](https://help.apple.com/itc/podcasts_connect/#/itcb54353390).
It parses all other feed extensions in a generic way (see the [Extensions](#extensions) section for more details).
#### Invalid Feeds
A best-effort attempt is made at parsing broken and invalid XML feeds. Currently, `gofeed` can successfully parse feeds with the following issues:
- Unescaped/Naked Markup in feed elements
- Undeclared namespace prefixes
- Missing closing tags on certain elements
- Illegal tags within feed elements without namespace prefixes
- Missing "required" elements as specified by the respective feed specs.
- Incorrect date formats
## Overview
The `gofeed` library is composed of a universal feed parser and several feed specific parsers. Which one you choose depends entirely on your use case. If you will be handling both RSS and Atom feeds, it makes sense to use the `gofeed.Parser`. If you know ahead of time that you will only be parsing one feed type, it makes sense to use `rss.Parser` or `atom.Parser` directly.
#### Universal Feed Parser
The universal `gofeed.Parser` works in three stages: detection, parsing and translation. It first detects the feed type that it is currently parsing. Then it uses a feed specific parser to parse the feed into its true representation, which will be either an `rss.Feed` or an `atom.Feed`. These models cover every field possible for their respective feed types. Finally, they are *translated* into a `gofeed.Feed` model that is a hybrid of both feed types. Performing the universal feed parsing in these three stages allows for more flexibility and keeps the code base more maintainable by separating RSS and Atom parsing into separate packages.
![Diagram](docs/sequence.png)
The translation step is performed by anything that adheres to the `gofeed.Translator` interface. The `DefaultRSSTranslator` and `DefaultAtomTranslator` are used behind the scenes when you use the `gofeed.Parser` with its default settings. You can see how they translate fields from `atom.Feed` or `rss.Feed` to the universal `gofeed.Feed` struct in the [Default Mappings](#default-mappings) section. However, should you disagree with the way certain fields are translated, you can easily supply your own `gofeed.Translator` and override this behavior. See the [Advanced Usage](#advanced-usage) section for an example of how to do this.
#### Feed Specific Parsers
The `gofeed` library provides two feed specific parsers: `atom.Parser` and `rss.Parser`. If the hybrid `gofeed.Feed` model that the universal `gofeed.Parser` produces does not contain a field from the `atom.Feed` or `rss.Feed` model that you require, it might be beneficial to use the feed specific parsers. When using the `atom.Parser` or `rss.Parser` directly, you can access all of the fields found in the `atom.Feed` and `rss.Feed` models. It is also marginally faster because you are able to skip the translation step.
## Basic Usage
#### Universal Feed Parser
The most common usage scenario will be to use `gofeed.Parser` to parse an arbitrary RSS or Atom feed into the hybrid `gofeed.Feed` model. This hybrid model allows you to treat RSS and Atom feeds the same.
##### Parse a feed from a URL:
```go
fp := gofeed.NewParser()
feed, _ := fp.ParseURL("http://feeds.twit.tv/twit.xml")
fmt.Println(feed.Title)
```
##### Parse a feed from a string:
```go
feedData := `<rss version="2.0">
<channel>
<title>Sample Feed</title>
</channel>
</rss>`
fp := gofeed.NewParser()
feed, _ := fp.ParseString(feedData)
fmt.Println(feed.Title)
```
##### Parse a feed from an io.Reader:
```go
file, _ := os.Open("/path/to/a/file.xml")
defer file.Close()
fp := gofeed.NewParser()
feed, _ := fp.Parse(file)
fmt.Println(feed.Title)
```
#### Feed Specific Parsers
You can easily use the `rss.Parser` and `atom.Parser` directly if you have a usage scenario that requires it:
##### Parse an RSS feed into an `rss.Feed`
```go
feedData := `<rss version="2.0">
<channel>
<webMaster>example@site.com (Example Name)</webMaster>
</channel>
</rss>`
fp := rss.Parser{}
rssFeed, _ := fp.Parse(strings.NewReader(feedData))
fmt.Println(rssFeed.WebMaster)
```
##### Parse an Atom feed into an `atom.Feed`
```go
feedData := `<feed xmlns="http://www.w3.org/2005/Atom">
<subtitle>Example Atom</subtitle>
</feed>`
fp := atom.Parser{}
atomFeed, _ := fp.Parse(strings.NewReader(feedData))
fmt.Println(atomFeed.Subtitle)
```
## Advanced Usage
##### Parse a feed while using a custom translator
The mappings and precedence order that are outlined in the [Default Mappings](#default-mappings) section are provided by the following two structs: `DefaultRSSTranslator` and `DefaultAtomTranslator`. If you have fields that you think should have a different precedence, or if you want to make a translator that is aware of an unsupported extension, you can do so by specifying your own RSS or Atom translator when using the `gofeed.Parser`.
Here is a simple example of creating a custom `Translator` that makes the `/rss/channel/itunes:author` field have a higher precedence than the `/rss/channel/managingEditor` field in RSS feeds. We will wrap the existing `DefaultRSSTranslator` since we only want to change the behavior for a single field.
First we must define a custom translator:
```go
import (
"fmt"
"github.com/mmcdole/gofeed"
"github.com/mmcdole/gofeed/rss"
)
type MyCustomTranslator struct {
defaultTranslator *gofeed.DefaultRSSTranslator
}
func NewMyCustomTranslator() *MyCustomTranslator {
t := &MyCustomTranslator{}
// We create a DefaultRSSTranslator internally so we can wrap its Translate
// call since we only want to modify the precedence for a single field.
t.defaultTranslator = &gofeed.DefaultRSSTranslator{}
return t
}
func (ct *MyCustomTranslator) Translate(feed interface{}) (*gofeed.Feed, error) {
rss, found := feed.(*rss.Feed)
if !found {
return nil, fmt.Errorf("Feed did not match expected type of *rss.Feed")
}
f, err := ct.defaultTranslator.Translate(rss)
if err != nil {
return nil, err
}
if rss.ITunesExt != nil && rss.ITunesExt.Author != "" {
f.Author = &gofeed.Person{Name: rss.ITunesExt.Author}
} else {
f.Author = &gofeed.Person{Name: rss.ManagingEditor}
}
return f, nil
}
```
Next you must configure your `gofeed.Parser` to utilize the new `gofeed.Translator`:
```go
feedData := `<rss version="2.0">
<channel>
<managingEditor>Ender Wiggin</managingEditor>
<itunes:author>Valentine Wiggin</itunes:author>
</channel>
</rss>`
fp := gofeed.NewParser()
fp.RSSTranslator = NewMyCustomTranslator()
feed, _ := fp.ParseString(feedData)
fmt.Println(feed.Author.Name) // Valentine Wiggin
```
## Extensions
Every element which does not belong to the feed's default namespace is considered an extension by `gofeed`. These are parsed and stored in a tree-like structure located at `Feed.Extensions` and `Item.Extensions`. These fields should allow you to access and read any custom extension elements.
In addition to the generic handling of extensions, `gofeed` also has built-in support for parsing certain popular extensions into their own structs for convenience. It currently supports the [Dublin Core](http://dublincore.org/documents/dces/) and [Apple iTunes](https://help.apple.com/itc/podcasts_connect/#/itcb54353390) extensions, which you can access at `Feed.ITunesExt`, `Feed.DublinCoreExt`, `Item.ITunesExt` and `Item.DublinCoreExt`.
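Reading an extension value looks roughly like this. This is a minimal sketch that reuses the feed URL from the earlier example and assumes that feed carries `itunes` elements; error handling is omitted for brevity:

```go
fp := gofeed.NewParser()
feed, _ := fp.ParseURL("http://feeds.twit.tv/twit.xml")

// Generic access: Extensions maps namespace prefix -> element name -> []Extension.
if itunes, ok := feed.Extensions["itunes"]; ok {
    if authors, ok := itunes["author"]; ok && len(authors) > 0 {
        fmt.Println(authors[0].Value)
    }
}

// Convenience struct for the built-in iTunes extension.
if feed.ITunesExt != nil {
    fmt.Println(feed.ITunesExt.Author)
}
```

The generic tree is useful for extensions that `gofeed` does not model explicitly, while the typed structs are more convenient for the popular ones.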
## Default Mappings
The `DefaultRSSTranslator` and the `DefaultAtomTranslator` map the following `rss.Feed` and `atom.Feed` fields to their respective `gofeed.Feed` fields. They are listed in order of precedence (highest to lowest):
`gofeed.Feed` | RSS | Atom
--- | --- | ---
Title | /rss/channel/title<br>/rdf:RDF/channel/title<br>/rss/channel/dc:title<br>/rdf:RDF/channel/dc:title | /feed/title
Description | /rss/channel/description<br>/rdf:RDF/channel/description<br>/rss/channel/itunes:subtitle | /feed/subtitle<br>/feed/tagline
Link | /rss/channel/link<br>/rdf:RDF/channel/link | /feed/link[@rel="alternate"]/@href<br>/feed/link[not(@rel)]/@href
FeedLink | /rss/channel/atom:link[@rel="self"]/@href<br>/rdf:RDF/channel/atom:link[@rel="self"]/@href | /feed/link[@rel="self"]/@href
Updated | /rss/channel/lastBuildDate<br>/rss/channel/dc:date<br>/rdf:RDF/channel/dc:date | /feed/updated<br>/feed/modified
Published | /rss/channel/pubDate |
Author | /rss/channel/managingEditor<br>/rss/channel/webMaster<br>/rss/channel/dc:author<br>/rdf:RDF/channel/dc:author<br>/rss/channel/dc:creator<br>/rdf:RDF/channel/dc:creator<br>/rss/channel/itunes:author | /feed/author
Language | /rss/channel/language<br>/rss/channel/dc:language<br>/rdf:RDF/channel/dc:language | /feed/@xml:lang
Image | /rss/channel/image<br>/rdf:RDF/image<br>/rss/channel/itunes:image | /feed/logo
Copyright | /rss/channel/copyright<br>/rss/channel/dc:rights<br>/rdf:RDF/channel/dc:rights | /feed/rights<br>/feed/copyright
Generator | /rss/channel/generator | /feed/generator
Categories | /rss/channel/category<br>/rss/channel/itunes:category<br>/rss/channel/itunes:keywords<br>/rss/channel/dc:subject<br>/rdf:RDF/channel/dc:subject | /feed/category
`gofeed.Item` | RSS | Atom
--- | --- | ---
Title | /rss/channel/item/title<br>/rdf:RDF/item/title<br>/rdf:RDF/item/dc:title<br>/rss/channel/item/dc:title | /feed/entry/title
Description | /rss/channel/item/description<br>/rdf:RDF/item/description<br>/rss/channel/item/dc:description<br>/rdf:RDF/item/dc:description | /feed/entry/summary
Content | /rss/channel/item/content:encoded | /feed/entry/content
Link | /rss/channel/item/link<br>/rdf:RDF/item/link | /feed/entry/link[@rel="alternate"]/@href<br>/feed/entry/link[not(@rel)]/@href
Updated | /rss/channel/item/dc:date<br>/rdf:RDF/rdf:item/dc:date | /feed/entry/modified<br>/feed/entry/updated
Published | /rss/channel/item/pubDate<br>/rss/channel/item/dc:date | /feed/entry/published<br>/feed/entry/issued
Author | /rss/channel/item/author<br>/rss/channel/item/dc:author<br>/rdf:RDF/item/dc:author<br>/rss/channel/item/dc:creator<br>/rdf:RDF/item/dc:creator<br>/rss/channel/item/itunes:author | /feed/entry/author
GUID | /rss/channel/item/guid | /feed/entry/id
Image | /rss/channel/item/itunes:image<br>/rss/channel/item/media:image |
Categories | /rss/channel/item/category<br>/rss/channel/item/dc:subject<br>/rss/channel/item/itunes:keywords<br>/rdf:RDF/channel/item/dc:subject | /feed/entry/category
Enclosures | /rss/channel/item/enclosure | /feed/entry/link[@rel="enclosure"]
## Dependencies
* [goxpp](https://github.com/mmcdole/goxpp) - XML Pull Parser
* [goquery](https://github.com/PuerkitoBio/goquery) - Go jQuery-like interface
* [testify](https://github.com/stretchr/testify) - Unit test enhancements
## License
This project is licensed under the [MIT License](https://raw.githubusercontent.com/mmcdole/gofeed/master/LICENSE)
## Credits
* [cristoper](https://github.com/cristoper) for his work on implementing xml:base relative URI handling.
* [Mark Pilgrim](https://en.wikipedia.org/wiki/Mark_Pilgrim) and [Kurt McKee](http://kurtmckee.org) for their work on the excellent [Universal Feed Parser](https://github.com/kurtmckee/feedparser) Python library. This library was the inspiration for the `gofeed` library.
* [Dan MacTough](http://blog.mact.me) for his work on [node-feedparser](https://github.com/danmactough/node-feedparser). It provided inspiration for the set of fields that should be covered in the hybrid `gofeed.Feed` model.
* [Matt Jibson](https://mattjibson.com/) for his date parsing function in the [goread](https://github.com/mjibson/goread) project.
* [Jim Teeuwen](https://github.com/jteeuwen) for his method of representing arbitrary feed extensions in the [go-pkg-rss](https://github.com/jteeuwen/go-pkg-rss) library.

vendor/github.com/mmcdole/gofeed/atom/feed.go generated vendored Normal file

@@ -0,0 +1,114 @@
package atom
import (
"encoding/json"
"time"
"github.com/mmcdole/gofeed/extensions"
)
// Feed is an Atom Feed
type Feed struct {
Title string `json:"title,omitempty"`
ID string `json:"id,omitempty"`
Updated string `json:"updated,omitempty"`
UpdatedParsed *time.Time `json:"updatedParsed,omitempty"`
Subtitle string `json:"subtitle,omitempty"`
Links []*Link `json:"links,omitempty"`
Language string `json:"language,omitempty"`
Generator *Generator `json:"generator,omitempty"`
Icon string `json:"icon,omitempty"`
Logo string `json:"logo,omitempty"`
Rights string `json:"rights,omitempty"`
Contributors []*Person `json:"contributors,omitempty"`
Authors []*Person `json:"authors,omitempty"`
Categories []*Category `json:"categories,omitempty"`
Entries []*Entry `json:"entries"`
Extensions ext.Extensions `json:"extensions,omitempty"`
Version string `json:"version"`
}
func (f Feed) String() string {
json, _ := json.MarshalIndent(f, "", " ")
return string(json)
}
// Entry is an Atom Entry
type Entry struct {
Title string `json:"title,omitempty"`
ID string `json:"id,omitempty"`
Updated string `json:"updated,omitempty"`
UpdatedParsed *time.Time `json:"updatedParsed,omitempty"`
Summary string `json:"summary,omitempty"`
Authors []*Person `json:"authors,omitempty"`
Contributors []*Person `json:"contributors,omitempty"`
Categories []*Category `json:"categories,omitempty"`
Links []*Link `json:"links,omitempty"`
Rights string `json:"rights,omitempty"`
Published string `json:"published,omitempty"`
PublishedParsed *time.Time `json:"publishedParsed,omitempty"`
Source *Source `json:"source,omitempty"`
Content *Content `json:"content,omitempty"`
Extensions ext.Extensions `json:"extensions,omitempty"`
}
// Category is category metadata for Feeds and Entries
type Category struct {
Term string `json:"term,omitempty"`
Scheme string `json:"scheme,omitempty"`
Label string `json:"label,omitempty"`
}
// Person represents a person in an Atom feed
// for things like Authors, Contributors, etc
type Person struct {
Name string `json:"name,omitempty"`
Email string `json:"email,omitempty"`
URI string `json:"uri,omitempty"`
}
// Link is an Atom link that defines a reference
// from an entry or feed to a Web resource
type Link struct {
Href string `json:"href,omitempty"`
Hreflang string `json:"hreflang,omitempty"`
Rel string `json:"rel,omitempty"`
Type string `json:"type,omitempty"`
Title string `json:"title,omitempty"`
Length string `json:"length,omitempty"`
}
// Content either contains or links to the content of
// the entry
type Content struct {
Src string `json:"src,omitempty"`
Type string `json:"type,omitempty"`
Value string `json:"value,omitempty"`
}
// Generator identifies the agent used to generate a
// feed, for debugging and other purposes.
type Generator struct {
Value string `json:"value,omitempty"`
URI string `json:"uri,omitempty"`
Version string `json:"version,omitempty"`
}
// Source contains the feed information for another
// feed if a given entry came from that feed.
type Source struct {
Title string `json:"title,omitempty"`
ID string `json:"id,omitempty"`
Updated string `json:"updated,omitempty"`
UpdatedParsed *time.Time `json:"updatedParsed,omitempty"`
Subtitle string `json:"subtitle,omitempty"`
Links []*Link `json:"links,omitempty"`
Generator *Generator `json:"generator,omitempty"`
Icon string `json:"icon,omitempty"`
Logo string `json:"logo,omitempty"`
Rights string `json:"rights,omitempty"`
Contributors []*Person `json:"contributors,omitempty"`
Authors []*Person `json:"authors,omitempty"`
Categories []*Category `json:"categories,omitempty"`
Extensions ext.Extensions `json:"extensions,omitempty"`
}

vendor/github.com/mmcdole/gofeed/atom/parser.go generated vendored Normal file

@@ -0,0 +1,761 @@
package atom
import (
"encoding/base64"
"io"
"strings"
"github.com/PuerkitoBio/goquery"
ext "github.com/mmcdole/gofeed/extensions"
"github.com/mmcdole/gofeed/internal/shared"
xpp "github.com/mmcdole/goxpp"
)
var (
// Atom elements which contain URIs
// https://tools.ietf.org/html/rfc4287
uriElements = map[string]bool{
"icon": true,
"id": true,
"logo": true,
"uri": true,
"url": true, // atom 0.3
}
// Atom attributes which contain URIs
// https://tools.ietf.org/html/rfc4287
atomURIAttrs = map[string]bool{
"href": true,
"scheme": true,
"src": true,
"uri": true,
}
)
// Parser is an Atom Parser
type Parser struct {
base *shared.XMLBase
}
// Parse parses an xml feed into an atom.Feed
func (ap *Parser) Parse(feed io.Reader) (*Feed, error) {
p := xpp.NewXMLPullParser(feed, false, shared.NewReaderLabel)
ap.base = &shared.XMLBase{URIAttrs: atomURIAttrs}
_, err := ap.base.FindRoot(p)
if err != nil {
return nil, err
}
return ap.parseRoot(p)
}
func (ap *Parser) parseRoot(p *xpp.XMLPullParser) (*Feed, error) {
if err := p.Expect(xpp.StartTag, "feed"); err != nil {
return nil, err
}
atom := &Feed{}
atom.Entries = []*Entry{}
atom.Version = ap.parseVersion(p)
atom.Language = ap.parseLanguage(p)
contributors := []*Person{}
authors := []*Person{}
categories := []*Category{}
links := []*Link{}
extensions := ext.Extensions{}
for {
tok, err := ap.base.NextTag(p)
if err != nil {
return nil, err
}
if tok == xpp.EndTag {
break
}
if tok == xpp.StartTag {
name := strings.ToLower(p.Name)
if shared.IsExtension(p) {
e, err := shared.ParseExtension(extensions, p)
if err != nil {
return nil, err
}
extensions = e
} else if name == "title" {
result, err := ap.parseAtomText(p)
if err != nil {
return nil, err
}
atom.Title = result
} else if name == "id" {
result, err := ap.parseAtomText(p)
if err != nil {
return nil, err
}
atom.ID = result
} else if name == "updated" ||
name == "modified" {
result, err := ap.parseAtomText(p)
if err != nil {
return nil, err
}
atom.Updated = result
date, err := shared.ParseDate(result)
if err == nil {
utcDate := date.UTC()
atom.UpdatedParsed = &utcDate
}
} else if name == "subtitle" ||
name == "tagline" {
result, err := ap.parseAtomText(p)
if err != nil {
return nil, err
}
atom.Subtitle = result
} else if name == "link" {
result, err := ap.parseLink(p)
if err != nil {
return nil, err
}
links = append(links, result)
} else if name == "generator" {
result, err := ap.parseGenerator(p)
if err != nil {
return nil, err
}
atom.Generator = result
} else if name == "icon" {
result, err := ap.parseAtomText(p)
if err != nil {
return nil, err
}
atom.Icon = result
} else if name == "logo" {
result, err := ap.parseAtomText(p)
if err != nil {
return nil, err
}
atom.Logo = result
} else if name == "rights" ||
name == "copyright" {
result, err := ap.parseAtomText(p)
if err != nil {
return nil, err
}
atom.Rights = result
} else if name == "contributor" {
result, err := ap.parsePerson("contributor", p)
if err != nil {
return nil, err
}
contributors = append(contributors, result)
} else if name == "author" {
result, err := ap.parsePerson("author", p)
if err != nil {
return nil, err
}
authors = append(authors, result)
} else if name == "category" {
result, err := ap.parseCategory(p)
if err != nil {
return nil, err
}
categories = append(categories, result)
} else if name == "entry" {
result, err := ap.parseEntry(p)
if err != nil {
return nil, err
}
atom.Entries = append(atom.Entries, result)
} else {
err := p.Skip()
if err != nil {
return nil, err
}
}
}
}
if len(categories) > 0 {
atom.Categories = categories
}
if len(authors) > 0 {
atom.Authors = authors
}
if len(contributors) > 0 {
atom.Contributors = contributors
}
if len(links) > 0 {
atom.Links = links
}
if len(extensions) > 0 {
atom.Extensions = extensions
}
if err := p.Expect(xpp.EndTag, "feed"); err != nil {
return nil, err
}
return atom, nil
}
func (ap *Parser) parseEntry(p *xpp.XMLPullParser) (*Entry, error) {
if err := p.Expect(xpp.StartTag, "entry"); err != nil {
return nil, err
}
entry := &Entry{}
contributors := []*Person{}
authors := []*Person{}
categories := []*Category{}
links := []*Link{}
extensions := ext.Extensions{}
for {
tok, err := ap.base.NextTag(p)
if err != nil {
return nil, err
}
if tok == xpp.EndTag {
break
}
if tok == xpp.StartTag {
name := strings.ToLower(p.Name)
if shared.IsExtension(p) {
e, err := shared.ParseExtension(extensions, p)
if err != nil {
return nil, err
}
extensions = e
} else if name == "title" {
result, err := ap.parseAtomText(p)
if err != nil {
return nil, err
}
entry.Title = result
} else if name == "id" {
result, err := ap.parseAtomText(p)
if err != nil {
return nil, err
}
entry.ID = result
} else if name == "rights" ||
name == "copyright" {
result, err := ap.parseAtomText(p)
if err != nil {
return nil, err
}
entry.Rights = result
} else if name == "summary" {
result, err := ap.parseAtomText(p)
if err != nil {
return nil, err
}
entry.Summary = result
} else if name == "source" {
result, err := ap.parseSource(p)
if err != nil {
return nil, err
}
entry.Source = result
} else if name == "updated" ||
name == "modified" {
result, err := ap.parseAtomText(p)
if err != nil {
return nil, err
}
entry.Updated = result
date, err := shared.ParseDate(result)
if err == nil {
utcDate := date.UTC()
entry.UpdatedParsed = &utcDate
}
} else if name == "contributor" {
result, err := ap.parsePerson("contributor", p)
if err != nil {
return nil, err
}
contributors = append(contributors, result)
} else if name == "author" {
result, err := ap.parsePerson("author", p)
if err != nil {
return nil, err
}
authors = append(authors, result)
} else if name == "category" {
result, err := ap.parseCategory(p)
if err != nil {
return nil, err
}
categories = append(categories, result)
} else if name == "link" {
result, err := ap.parseLink(p)
if err != nil {
return nil, err
}
links = append(links, result)
} else if name == "published" ||
name == "issued" {
result, err := ap.parseAtomText(p)
if err != nil {
return nil, err
}
entry.Published = result
date, err := shared.ParseDate(result)
if err == nil {
utcDate := date.UTC()
entry.PublishedParsed = &utcDate
}
} else if name == "content" {
result, err := ap.parseContent(p)
if err != nil {
return nil, err
}
entry.Content = result
} else {
err := p.Skip()
if err != nil {
return nil, err
}
}
}
}
if len(categories) > 0 {
entry.Categories = categories
}
if len(authors) > 0 {
entry.Authors = authors
}
if len(links) > 0 {
entry.Links = links
}
if len(contributors) > 0 {
entry.Contributors = contributors
}
if len(extensions) > 0 {
entry.Extensions = extensions
}
if err := p.Expect(xpp.EndTag, "entry"); err != nil {
return nil, err
}
return entry, nil
}
func (ap *Parser) parseSource(p *xpp.XMLPullParser) (*Source, error) {
if err := p.Expect(xpp.StartTag, "source"); err != nil {
return nil, err
}
source := &Source{}
contributors := []*Person{}
authors := []*Person{}
categories := []*Category{}
links := []*Link{}
extensions := ext.Extensions{}
for {
tok, err := ap.base.NextTag(p)
if err != nil {
return nil, err
}
if tok == xpp.EndTag {
break
}
if tok == xpp.StartTag {
name := strings.ToLower(p.Name)
if shared.IsExtension(p) {
e, err := shared.ParseExtension(extensions, p)
if err != nil {
return nil, err
}
extensions = e
} else if name == "title" {
result, err := ap.parseAtomText(p)
if err != nil {
return nil, err
}
source.Title = result
} else if name == "id" {
result, err := ap.parseAtomText(p)
if err != nil {
return nil, err
}
source.ID = result
} else if name == "updated" ||
name == "modified" {
result, err := ap.parseAtomText(p)
if err != nil {
return nil, err
}
source.Updated = result
date, err := shared.ParseDate(result)
if err == nil {
utcDate := date.UTC()
source.UpdatedParsed = &utcDate
}
} else if name == "subtitle" ||
name == "tagline" {
result, err := ap.parseAtomText(p)
if err != nil {
return nil, err
}
source.Subtitle = result
} else if name == "link" {
result, err := ap.parseLink(p)
if err != nil {
return nil, err
}
links = append(links, result)
} else if name == "generator" {
result, err := ap.parseGenerator(p)
if err != nil {
return nil, err
}
source.Generator = result
} else if name == "icon" {
result, err := ap.parseAtomText(p)
if err != nil {
return nil, err
}
source.Icon = result
} else if name == "logo" {
result, err := ap.parseAtomText(p)
if err != nil {
return nil, err
}
source.Logo = result
} else if name == "rights" ||
name == "copyright" {
result, err := ap.parseAtomText(p)
if err != nil {
return nil, err
}
source.Rights = result
} else if name == "contributor" {
result, err := ap.parsePerson("contributor", p)
if err != nil {
return nil, err
}
contributors = append(contributors, result)
} else if name == "author" {
result, err := ap.parsePerson("author", p)
if err != nil {
return nil, err
}
authors = append(authors, result)
} else if name == "category" {
result, err := ap.parseCategory(p)
if err != nil {
return nil, err
}
categories = append(categories, result)
} else {
err := p.Skip()
if err != nil {
return nil, err
}
}
}
}
if len(categories) > 0 {
source.Categories = categories
}
if len(authors) > 0 {
source.Authors = authors
}
if len(contributors) > 0 {
source.Contributors = contributors
}
if len(links) > 0 {
source.Links = links
}
if len(extensions) > 0 {
source.Extensions = extensions
}
if err := p.Expect(xpp.EndTag, "source"); err != nil {
return nil, err
}
return source, nil
}
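// parseContent parses an atom <content> element, capturing its type and
// src attributes along with its text value.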
func (ap *Parser) parseContent(p *xpp.XMLPullParser) (*Content, error) {
c := &Content{}
c.Type = p.Attribute("type")
c.Src = p.Attribute("src")
text, err := ap.parseAtomText(p)
if err != nil {
return nil, err
}
c.Value = text
return c, nil
}
func (ap *Parser) parsePerson(name string, p *xpp.XMLPullParser) (*Person, error) {
if err := p.Expect(xpp.StartTag, name); err != nil {
return nil, err
}
person := &Person{}
for {
tok, err := ap.base.NextTag(p)
if err != nil {
return nil, err
}
if tok == xpp.EndTag {
break
}
if tok == xpp.StartTag {
name := strings.ToLower(p.Name)
if name == "name" {
result, err := ap.parseAtomText(p)
if err != nil {
return nil, err
}
person.Name = result
} else if name == "email" {
result, err := ap.parseAtomText(p)
if err != nil {
return nil, err
}
person.Email = result
} else if name == "uri" ||
name == "url" ||
name == "homepage" {
result, err := ap.parseAtomText(p)
if err != nil {
return nil, err
}
person.URI = result
} else {
err := p.Skip()
if err != nil {
return nil, err
}
}
}
}
if err := p.Expect(xpp.EndTag, name); err != nil {
return nil, err
}
return person, nil
}
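// parseLink parses an atom <link> element's attributes, defaulting the
// rel attribute to "alternate" when it is absent.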
func (ap *Parser) parseLink(p *xpp.XMLPullParser) (*Link, error) {
if err := p.Expect(xpp.StartTag, "link"); err != nil {
return nil, err
}
l := &Link{}
l.Href = p.Attribute("href")
l.Hreflang = p.Attribute("hreflang")
l.Type = p.Attribute("type")
l.Length = p.Attribute("length")
l.Title = p.Attribute("title")
l.Rel = p.Attribute("rel")
if l.Rel == "" {
l.Rel = "alternate"
}
if err := p.Skip(); err != nil {
return nil, err
}
if err := p.Expect(xpp.EndTag, "link"); err != nil {
return nil, err
}
return l, nil
}
func (ap *Parser) parseCategory(p *xpp.XMLPullParser) (*Category, error) {
if err := p.Expect(xpp.StartTag, "category"); err != nil {
return nil, err
}
c := &Category{}
c.Term = p.Attribute("term")
c.Scheme = p.Attribute("scheme")
c.Label = p.Attribute("label")
if err := p.Skip(); err != nil {
return nil, err
}
if err := p.Expect(xpp.EndTag, "category"); err != nil {
return nil, err
}
return c, nil
}
func (ap *Parser) parseGenerator(p *xpp.XMLPullParser) (*Generator, error) {
if err := p.Expect(xpp.StartTag, "generator"); err != nil {
return nil, err
}
g := &Generator{}
uri := p.Attribute("uri") // Atom 1.0
url := p.Attribute("url") // Atom 0.3
if uri != "" {
g.URI = uri
} else if url != "" {
g.URI = url
}
g.Version = p.Attribute("version")
result, err := ap.parseAtomText(p)
if err != nil {
return nil, err
}
g.Value = result
if err := p.Expect(xpp.EndTag, "generator"); err != nil {
return nil, err
}
return g, nil
}
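// parseAtomText decodes an Atom text construct, handling CDATA sections,
// plain text, html/xhtml and base64-encoded content, and resolves relative
// URLs for URI-bearing elements.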
func (ap *Parser) parseAtomText(p *xpp.XMLPullParser) (string, error) {
var text struct {
Type string `xml:"type,attr"`
Mode string `xml:"mode,attr"`
InnerXML string `xml:",innerxml"`
}
err := p.DecodeElement(&text)
if err != nil {
return "", err
}
result := text.InnerXML
result = strings.TrimSpace(result)
lowerType := strings.ToLower(text.Type)
lowerMode := strings.ToLower(text.Mode)
if strings.Contains(result, "<![CDATA[") {
result = shared.StripCDATA(result)
if lowerType == "html" || strings.Contains(lowerType, "xhtml") {
result, _ = ap.base.ResolveHTML(result)
}
} else {
// decode non-CDATA contents depending on type
if lowerType == "text" ||
strings.HasPrefix(lowerType, "text/") ||
(lowerType == "" && lowerMode == "") {
result, err = shared.DecodeEntities(result)
} else if strings.Contains(lowerType, "xhtml") {
result = ap.stripWrappingDiv(result)
result, _ = ap.base.ResolveHTML(result)
} else if lowerType == "html" {
result = ap.stripWrappingDiv(result)
result, err = shared.DecodeEntities(result)
if err == nil {
result, _ = ap.base.ResolveHTML(result)
}
} else {
decodedStr, err := base64.StdEncoding.DecodeString(result)
if err == nil {
result = string(decodedStr)
}
}
}
// resolve relative URIs in URI-containing elements according to xml:base
name := strings.ToLower(p.Name)
if uriElements[name] {
resolved, err := ap.base.ResolveURL(result)
if err == nil {
result = resolved
}
}
return result, err
}
func (ap *Parser) parseLanguage(p *xpp.XMLPullParser) string {
return p.Attribute("lang")
}
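// parseVersion determines the Atom version from the version attribute,
// falling back to the xmlns namespace when it is absent.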
func (ap *Parser) parseVersion(p *xpp.XMLPullParser) string {
ver := p.Attribute("version")
if ver != "" {
return ver
}
ns := p.Attribute("xmlns")
if ns == "http://purl.org/atom/ns#" {
return "0.3"
}
if ns == "http://www.w3.org/2005/Atom" {
return "1.0"
}
return ""
}
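// stripWrappingDiv removes the wrapping <div> element that Atom xhtml
// content carries, returning its inner HTML when exactly one div is present.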
func (ap *Parser) stripWrappingDiv(content string) (result string) {
result = content
r := strings.NewReader(result)
doc, err := goquery.NewDocumentFromReader(r)
if err == nil {
root := doc.Find("body").Children()
if root.Is("div") && root.Siblings().Size() == 0 {
html, err := root.Unwrap().Html()
if err == nil {
result = html
}
}
}
return
}

vendor/github.com/mmcdole/gofeed/detector.go generated vendored Normal file

@@ -0,0 +1,48 @@
package gofeed
import (
"io"
"strings"
"github.com/mmcdole/gofeed/internal/shared"
xpp "github.com/mmcdole/goxpp"
)
// FeedType represents one of the possible feed
// types that we can detect.
type FeedType int
const (
// FeedTypeUnknown represents a feed that could not have its
// type determined.
FeedTypeUnknown FeedType = iota
// FeedTypeAtom represents an Atom feed
FeedTypeAtom
// FeedTypeRSS represents an RSS feed
FeedTypeRSS
)
// DetectFeedType attempts to determine the type of feed
// by looking for specific xml elements unique to the
// various feed types.
func DetectFeedType(feed io.Reader) FeedType {
p := xpp.NewXMLPullParser(feed, false, shared.NewReaderLabel)
xmlBase := shared.XMLBase{}
_, err := xmlBase.FindRoot(p)
if err != nil {
return FeedTypeUnknown
}
name := strings.ToLower(p.Name)
switch name {
case "rdf":
return FeedTypeRSS
case "rss":
return FeedTypeRSS
case "feed":
return FeedTypeAtom
default:
return FeedTypeUnknown
}
}


@@ -0,0 +1,45 @@
package ext
// DublinCoreExtension represents a feed extension
// for the Dublin Core specification.
type DublinCoreExtension struct {
Title []string `json:"title,omitempty"`
Creator []string `json:"creator,omitempty"`
Author []string `json:"author,omitempty"`
Subject []string `json:"subject,omitempty"`
Description []string `json:"description,omitempty"`
Publisher []string `json:"publisher,omitempty"`
Contributor []string `json:"contributor,omitempty"`
Date []string `json:"date,omitempty"`
Type []string `json:"type,omitempty"`
Format []string `json:"format,omitempty"`
Identifier []string `json:"identifier,omitempty"`
Source []string `json:"source,omitempty"`
Language []string `json:"language,omitempty"`
Relation []string `json:"relation,omitempty"`
Coverage []string `json:"coverage,omitempty"`
Rights []string `json:"rights,omitempty"`
}
// NewDublinCoreExtension creates a new DublinCoreExtension
// given the generic extension map for the "dc" prefix.
func NewDublinCoreExtension(extensions map[string][]Extension) *DublinCoreExtension {
dc := &DublinCoreExtension{}
dc.Title = parseTextArrayExtension("title", extensions)
dc.Creator = parseTextArrayExtension("creator", extensions)
dc.Author = parseTextArrayExtension("author", extensions)
dc.Subject = parseTextArrayExtension("subject", extensions)
dc.Description = parseTextArrayExtension("description", extensions)
dc.Publisher = parseTextArrayExtension("publisher", extensions)
dc.Contributor = parseTextArrayExtension("contributor", extensions)
dc.Date = parseTextArrayExtension("date", extensions)
dc.Type = parseTextArrayExtension("type", extensions)
dc.Format = parseTextArrayExtension("format", extensions)
dc.Identifier = parseTextArrayExtension("identifier", extensions)
dc.Source = parseTextArrayExtension("source", extensions)
dc.Language = parseTextArrayExtension("language", extensions)
dc.Relation = parseTextArrayExtension("relation", extensions)
dc.Coverage = parseTextArrayExtension("coverage", extensions)
dc.Rights = parseTextArrayExtension("rights", extensions)
return dc
}


@@ -0,0 +1,46 @@
package ext
// Extensions is the generic extension map for Feeds and Items.
// The first map is for the element namespace prefix (e.g., itunes).
// The second map is for the element name (e.g., author).
type Extensions map[string]map[string][]Extension
// Extension represents a single XML element that was in a
// non-default namespace in a Feed or Item/Entry.
type Extension struct {
Name string `json:"name"`
Value string `json:"value"`
Attrs map[string]string `json:"attrs"`
Children map[string][]Extension `json:"children"`
}
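// parseTextExtension returns the value of the first extension
// element with the given name.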
func parseTextExtension(name string, extensions map[string][]Extension) (value string) {
if extensions == nil {
return
}
matches, ok := extensions[name]
if !ok || len(matches) == 0 {
return
}
match := matches[0]
return match.Value
}
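// parseTextArrayExtension returns the values of every extension
// element with the given name.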
func parseTextArrayExtension(name string, extensions map[string][]Extension) (values []string) {
if extensions == nil {
return
}
matches, ok := extensions[name]
if !ok || len(matches) == 0 {
return
}
values = []string{}
for _, m := range matches {
values = append(values, m.Value)
}
return
}

vendor/github.com/mmcdole/gofeed/extensions/itunes.go generated vendored Normal file

@@ -0,0 +1,150 @@
package ext
// ITunesFeedExtension is a set of extension
// fields for RSS feeds.
type ITunesFeedExtension struct {
Author string `json:"author,omitempty"`
Block string `json:"block,omitempty"`
Categories []*ITunesCategory `json:"categories,omitempty"`
Explicit string `json:"explicit,omitempty"`
Keywords string `json:"keywords,omitempty"`
Owner *ITunesOwner `json:"owner,omitempty"`
Subtitle string `json:"subtitle,omitempty"`
Summary string `json:"summary,omitempty"`
Image string `json:"image,omitempty"`
Complete string `json:"complete,omitempty"`
NewFeedURL string `json:"newFeedUrl,omitempty"`
Type string `json:"type,omitempty"`
}
// ITunesItemExtension is a set of extension
// fields for RSS items.
type ITunesItemExtension struct {
Author string `json:"author,omitempty"`
Block string `json:"block,omitempty"`
Duration string `json:"duration,omitempty"`
Explicit string `json:"explicit,omitempty"`
Keywords string `json:"keywords,omitempty"`
Subtitle string `json:"subtitle,omitempty"`
Summary string `json:"summary,omitempty"`
Image string `json:"image,omitempty"`
IsClosedCaptioned string `json:"isClosedCaptioned,omitempty"`
Episode string `json:"episode,omitempty"`
Season string `json:"season,omitempty"`
Order string `json:"order,omitempty"`
EpisodeType string `json:"episodeType,omitempty"`
}
// ITunesCategory is a category element for itunes feeds.
type ITunesCategory struct {
Text string `json:"text,omitempty"`
Subcategory *ITunesCategory `json:"subcategory,omitempty"`
}
// ITunesOwner is the owner of a particular itunes feed.
type ITunesOwner struct {
Email string `json:"email,omitempty"`
Name string `json:"name,omitempty"`
}
// NewITunesFeedExtension creates an ITunesFeedExtension given an
// extension map for the "itunes" key.
func NewITunesFeedExtension(extensions map[string][]Extension) *ITunesFeedExtension {
feed := &ITunesFeedExtension{}
feed.Author = parseTextExtension("author", extensions)
feed.Block = parseTextExtension("block", extensions)
feed.Explicit = parseTextExtension("explicit", extensions)
feed.Keywords = parseTextExtension("keywords", extensions)
feed.Subtitle = parseTextExtension("subtitle", extensions)
feed.Summary = parseTextExtension("summary", extensions)
feed.Image = parseImage(extensions)
feed.Complete = parseTextExtension("complete", extensions)
feed.NewFeedURL = parseTextExtension("new-feed-url", extensions)
feed.Categories = parseCategories(extensions)
feed.Owner = parseOwner(extensions)
feed.Type = parseTextExtension("type", extensions)
return feed
}
// NewITunesItemExtension creates an ITunesItemExtension given an
// extension map for the "itunes" key.
func NewITunesItemExtension(extensions map[string][]Extension) *ITunesItemExtension {
entry := &ITunesItemExtension{}
entry.Author = parseTextExtension("author", extensions)
entry.Block = parseTextExtension("block", extensions)
entry.Duration = parseTextExtension("duration", extensions)
entry.Explicit = parseTextExtension("explicit", extensions)
entry.Subtitle = parseTextExtension("subtitle", extensions)
entry.Summary = parseTextExtension("summary", extensions)
entry.Keywords = parseTextExtension("keywords", extensions)
entry.Image = parseImage(extensions)
entry.IsClosedCaptioned = parseTextExtension("isClosedCaptioned", extensions)
entry.Episode = parseTextExtension("episode", extensions)
entry.Season = parseTextExtension("season", extensions)
entry.Order = parseTextExtension("order", extensions)
entry.EpisodeType = parseTextExtension("episodeType", extensions)
return entry
}
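// parseImage returns the href attribute of the first itunes:image element.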
func parseImage(extensions map[string][]Extension) (image string) {
if extensions == nil {
return
}
matches, ok := extensions["image"]
if !ok || len(matches) == 0 {
return
}
image = matches[0].Attrs["href"]
return
}
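// parseOwner builds an ITunesOwner from the name and email children
// of the first itunes:owner element.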
func parseOwner(extensions map[string][]Extension) (owner *ITunesOwner) {
if extensions == nil {
return
}
matches, ok := extensions["owner"]
if !ok || len(matches) == 0 {
return
}
owner = &ITunesOwner{}
if name, ok := matches[0].Children["name"]; ok {
owner.Name = name[0].Value
}
if email, ok := matches[0].Children["email"]; ok {
owner.Email = email[0].Value
}
return
}
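// parseCategories builds ITunesCategory values from itunes:category
// elements, including one level of subcategory.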
func parseCategories(extensions map[string][]Extension) (categories []*ITunesCategory) {
if extensions == nil {
return
}
matches, ok := extensions["category"]
if !ok || len(matches) == 0 {
return
}
categories = []*ITunesCategory{}
for _, cat := range matches {
c := &ITunesCategory{}
if text, ok := cat.Attrs["text"]; ok {
c.Text = text
}
if subs, ok := cat.Children["category"]; ok {
s := &ITunesCategory{}
if text, ok := subs[0].Attrs["text"]; ok {
s.Text = text
}
c.Subcategory = s
}
categories = append(categories, c)
}
return
}

vendor/github.com/mmcdole/gofeed/feed.go generated vendored Normal file

@@ -0,0 +1,84 @@
package gofeed
import (
"encoding/json"
"time"
"github.com/mmcdole/gofeed/extensions"
)
// Feed is the universal Feed type that atom.Feed
// and rss.Feed gets translated to. It represents
// a web feed.
type Feed struct {
Title string `json:"title,omitempty"`
Description string `json:"description,omitempty"`
Link string `json:"link,omitempty"`
FeedLink string `json:"feedLink,omitempty"`
Updated string `json:"updated,omitempty"`
UpdatedParsed *time.Time `json:"updatedParsed,omitempty"`
Published string `json:"published,omitempty"`
PublishedParsed *time.Time `json:"publishedParsed,omitempty"`
Author *Person `json:"author,omitempty"`
Language string `json:"language,omitempty"`
Image *Image `json:"image,omitempty"`
Copyright string `json:"copyright,omitempty"`
Generator string `json:"generator,omitempty"`
Categories []string `json:"categories,omitempty"`
DublinCoreExt *ext.DublinCoreExtension `json:"dcExt,omitempty"`
ITunesExt *ext.ITunesFeedExtension `json:"itunesExt,omitempty"`
Extensions ext.Extensions `json:"extensions,omitempty"`
Custom map[string]string `json:"custom,omitempty"`
Items []*Item `json:"items"`
FeedType string `json:"feedType"`
FeedVersion string `json:"feedVersion"`
}
func (f Feed) String() string {
json, _ := json.MarshalIndent(f, "", " ")
return string(json)
}
// Item is the universal Item type that atom.Entry
// and rss.Item gets translated to. It represents
// a single entry in a given feed.
type Item struct {
Title string `json:"title,omitempty"`
Description string `json:"description,omitempty"`
Content string `json:"content,omitempty"`
Link string `json:"link,omitempty"`
Updated string `json:"updated,omitempty"`
UpdatedParsed *time.Time `json:"updatedParsed,omitempty"`
Published string `json:"published,omitempty"`
PublishedParsed *time.Time `json:"publishedParsed,omitempty"`
Author *Person `json:"author,omitempty"`
GUID string `json:"guid,omitempty"`
Image *Image `json:"image,omitempty"`
Categories []string `json:"categories,omitempty"`
Enclosures []*Enclosure `json:"enclosures,omitempty"`
DublinCoreExt *ext.DublinCoreExtension `json:"dcExt,omitempty"`
ITunesExt *ext.ITunesItemExtension `json:"itunesExt,omitempty"`
Extensions ext.Extensions `json:"extensions,omitempty"`
Custom map[string]string `json:"custom,omitempty"`
}
// Person is an individual specified in a feed
// (e.g. an author)
type Person struct {
Name string `json:"name,omitempty"`
Email string `json:"email,omitempty"`
}
// Image is an image that is the artwork for a given
// feed or item.
type Image struct {
URL string `json:"url,omitempty"`
Title string `json:"title,omitempty"`
}
// Enclosure is a file associated with a given Item.
type Enclosure struct {
URL string `json:"url,omitempty"`
Length string `json:"length,omitempty"`
Type string `json:"type,omitempty"`
}

vendor/github.com/mmcdole/gofeed/go.mod generated vendored Normal file

@@ -0,0 +1,12 @@
module github.com/mmcdole/gofeed
require (
github.com/PuerkitoBio/goquery v1.5.0
github.com/codegangsta/cli v1.20.0
github.com/davecgh/go-spew v1.1.1 // indirect
github.com/mmcdole/goxpp v0.0.0-20181012175147-0068e33feabf
github.com/pmezard/go-difflib v1.0.0 // indirect
github.com/stretchr/testify v1.2.2
golang.org/x/net v0.0.0-20181220203305-927f97764cc3
golang.org/x/text v0.3.0
)

vendor/github.com/mmcdole/gofeed/go.sum generated vendored Normal file

@@ -0,0 +1,20 @@
github.com/PuerkitoBio/goquery v1.5.0 h1:uGvmFXOA73IKluu/F84Xd1tt/z07GYm8X49XKHP7EJk=
github.com/PuerkitoBio/goquery v1.5.0/go.mod h1:qD2PgZ9lccMbQlc7eEOjaeRlFQON7xY8kdmcsrnKqMg=
github.com/andybalholm/cascadia v1.0.0 h1:hOCXnnZ5A+3eVDX8pvgl4kofXv2ELss0bKcqRySc45o=
github.com/andybalholm/cascadia v1.0.0/go.mod h1:GsXiBklL0woXo1j/WYWtSYYC4ouU9PqHO0sqidkEA4Y=
github.com/codegangsta/cli v1.20.0 h1:iX1FXEgwzd5+XN6wk5cVHOGQj6Q3Dcp20lUeS4lHNTw=
github.com/codegangsta/cli v1.20.0/go.mod h1:/qJNoX69yVSKu5o4jLyXAENLRyk1uhi7zkbQ3slBdOA=
github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/mmcdole/goxpp v0.0.0-20181012175147-0068e33feabf h1:sWGE2v+hO0Nd4yFU/S/mDBM5plIU8v/Qhfz41hkDIAI=
github.com/mmcdole/goxpp v0.0.0-20181012175147-0068e33feabf/go.mod h1:pasqhqstspkosTneA62Nc+2p9SOBBYAPbnmRRWPQ0V8=
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
github.com/stretchr/testify v1.2.2 h1:bSDNvY7ZPG5RlJ8otE/7V6gMiyenm9RtJ7IUVIAoJ1w=
github.com/stretchr/testify v1.2.2/go.mod h1:a8OnRcib4nhh0OaRAV+Yts87kKdq0PP7pXfy6kDkUVs=
golang.org/x/net v0.0.0-20180218175443-cbe0f9307d01/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
golang.org/x/net v0.0.0-20181114220301-adae6a3d119a/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
golang.org/x/net v0.0.0-20181220203305-927f97764cc3 h1:eH6Eip3UpmR+yM/qI9Ijluzb1bNv/cAU/n+6l8tRSis=
golang.org/x/net v0.0.0-20181220203305-927f97764cc3/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
golang.org/x/text v0.3.0 h1:g61tztE5qeGQ89tm6NTjjM9VPIm088od1l6aSorWRWg=
golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=


@@ -0,0 +1,19 @@
package shared
import (
"io"
"golang.org/x/net/html/charset"
)
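// NewReaderLabel returns a reader that decodes the input from the
// charset identified by label into UTF-8.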
func NewReaderLabel(label string, input io.Reader) (io.Reader, error) {
conv, err := charset.NewReaderLabel(label, input)
if err != nil {
return nil, err
}
// Wrap the charset decoder reader with an XML sanitizer
//clean := NewXMLSanitizerReader(conv)
return conv, nil
}


@@ -0,0 +1,219 @@
package shared
import (
"fmt"
"strings"
"time"
)
// DateFormats taken from github.com/mjibson/goread
var dateFormats = []string{
time.RFC822, // RSS
time.RFC822Z, // RSS
time.RFC3339, // Atom
time.UnixDate,
time.RubyDate,
time.RFC850,
time.RFC1123Z,
time.RFC1123,
time.ANSIC,
"Mon, January 2 2006 15:04:05 -0700",
"Mon, Jan 2 2006 15:04:05 -700",
"Mon, Jan 2 2006 15:04:05 -0700",
"Mon Jan 2 15:04 2006",
"Mon Jan 02, 2006 3:04 pm",
"Mon Jan 02 2006 15:04:05 -0700",
"Monday, January 2, 2006 03:04 PM",
"Monday, January 2, 2006",
"Monday, January 02, 2006",
"Monday, 2 January 2006 15:04:05 -0700",
"Monday, 2 Jan 2006 15:04:05 -0700",
"Monday, 02 January 2006 15:04:05 -0700",
"Monday, 02 January 2006 15:04:05",
"Mon, 2 January 2006, 15:04 -0700",
"Mon, 2 January 2006 15:04:05 -0700",
"Mon, 2 January 2006",
"Mon, 2 Jan 2006 3:04:05 PM -0700",
"Mon, 2 Jan 2006 15:4:5 -0700 GMT",
"Mon, 2, Jan 2006 15:4",
"Mon, 2 Jan 2006, 15:04 -0700",
"Mon, 2 Jan 2006 15:04 -0700",
"Mon, 2 Jan 2006 15:04:05 UT",
"Mon, 2 Jan 2006 15:04:05 -0700 MST",
"Mon, 2 Jan 2006 15:04:05-0700",
"Mon, 2 Jan 2006 15:04:05 -0700",
"Mon, 2 Jan 2006 15:04:05",
"Mon, 2 Jan 2006 15:04",
"Mon,2 Jan 2006",
"Mon, 2 Jan 2006",
"Mon, 2 Jan 06 15:04:05 -0700",
"Mon, 2006-01-02 15:04",
"Mon, 02 January 2006",
"Mon, 02 Jan 2006 15 -0700",
"Mon, 02 Jan 2006 15:04 -0700",
"Mon, 02 Jan 2006 15:04:05 Z",
"Mon, 02 Jan 2006 15:04:05 UT",
"Mon, 02 Jan 2006 15:04:05 MST-07:00",
"Mon, 02 Jan 2006 15:04:05 MST -0700",
"Mon, 02 Jan 2006 15:04:05 GMT-0700",
"Mon,02 Jan 2006 15:04:05 -0700",
"Mon, 02 Jan 2006 15:04:05 -0700",
"Mon, 02 Jan 2006 15:04:05 -07:00",
"Mon, 02 Jan 2006 15:04:05 --0700",
"Mon 02 Jan 2006 15:04:05 -0700",
"Mon, 02 Jan 2006 15:04:05 -07",
"Mon, 02 Jan 2006 15:04:05 00",
"Mon, 02 Jan 2006 15:04:05",
"Mon, 02 Jan 2006",
"January 2, 2006 3:04 PM",
"January 2, 2006, 3:04 p.m.",
"January 2, 2006 15:04:05",
"January 2, 2006 03:04 PM",
"January 2, 2006",
"January 02, 2006 15:04",
"January 02, 2006 03:04 PM",
"January 02, 2006",
"Jan 2, 2006 3:04:05 PM",
"Jan 2, 2006",
"Jan 02 2006 03:04:05PM",
"Jan 02, 2006",
"6/1/2 15:04",
"6-1-2 15:04",
"2 January 2006 15:04:05 -0700",
"2 January 2006",
"2 Jan 2006 15:04:05 Z",
"2 Jan 2006 15:04:05 -0700",
"2 Jan 2006",
"2.1.2006 15:04:05",
"2/1/2006",
"2-1-2006",
"2006 January 02",
"2006-1-2T15:04:05Z",
"2006-1-2 15:04:05",
"2006-1-2",
"2006-1-02T15:04:05Z",
"2006-01-02T15:04Z",
"2006-01-02T15:04-07:00",
"2006-01-02T15:04:05Z",
"2006-01-02T15:04:05-07:00:00",
"2006-01-02T15:04:05:-0700",
"2006-01-02T15:04:05-0700",
"2006-01-02T15:04:05-07:00",
"2006-01-02T15:04:05 -0700",
"2006-01-02T15:04:05:00",
"2006-01-02T15:04:05",
"2006-01-02 at 15:04:05",
"2006-01-02 15:04:05Z",
"2006-01-02 15:04:05-0700",
"2006-01-02 15:04:05-07:00",
"2006-01-02 15:04:05 -0700",
"2006-01-02 15:04",
"2006-01-02 00:00:00.0 15:04:05.0 -0700",
"2006/01/02",
"2006-01-02",
"15:04 02.01.2006 -0700",
"1/2/2006 3:04:05 PM",
"1/2/2006",
"06/1/2 15:04",
"06-1-2 15:04",
"02 Monday, Jan 2006 15:04",
"02 Jan 2006 15:04:05 UT",
"02 Jan 2006 15:04:05 -0700",
"02 Jan 2006 15:04:05",
"02 Jan 2006",
"02.01.2006 15:04:05",
"02/01/2006 15:04:05",
"02.01.2006 15:04",
"02/01/2006 - 15:04",
"02.01.2006 -0700",
"02/01/2006",
"02-01-2006",
"01/02/2006 3:04 PM",
"01/02/2006 - 15:04",
"01/02/2006",
"01-02-2006",
}
// Named zones cannot be consistently loaded, so they are handled separately
var dateFormatsWithNamedZone = []string{
"Mon, January 02, 2006, 15:04:05 MST",
"Mon, January 02, 2006 15:04:05 MST",
"Mon, Jan 2, 2006 15:04 MST",
"Mon, Jan 2 2006 15:04 MST",
"Mon, Jan 2, 2006 15:04:05 MST",
"Mon Jan 2 15:04:05 2006 MST",
"Mon, Jan 02,2006 15:04:05 MST",
"Monday, January 2, 2006 15:04:05 MST",
"Monday, 2 January 2006 15:04:05 MST",
"Monday, 2 Jan 2006 15:04:05 MST",
"Monday, 02 January 2006 15:04:05 MST",
"Mon, 2 January 2006 15:04 MST",
"Mon, 2 January 2006, 15:04:05 MST",
"Mon, 2 January 2006 15:04:05 MST",
"Mon, 2 Jan 2006 15:4:5 MST",
"Mon, 2 Jan 2006 15:04 MST",
"Mon, 2 Jan 2006 15:04:05MST",
"Mon, 2 Jan 2006 15:04:05 MST",
"Mon 2 Jan 2006 15:04:05 MST",
"mon,2 Jan 2006 15:04:05 MST",
"Mon, 2 Jan 15:04:05 MST",
"Mon, 2 Jan 06 15:04:05 MST",
"Mon,02 January 2006 14:04:05 MST",
"Mon, 02 Jan 2006 3:04:05 PM MST",
"Mon,02 Jan 2006 15:04 MST",
"Mon, 02 Jan 2006 15:04 MST",
"Mon, 02 Jan 2006, 15:04:05 MST",
"Mon, 02 Jan 2006 15:04:05MST",
"Mon, 02 Jan 2006 15:04:05 MST",
"Mon , 02 Jan 2006 15:04:05 MST",
"Mon, 02 Jan 06 15:04:05 MST",
"January 2, 2006 15:04:05 MST",
"January 02, 2006 15:04:05 MST",
"Jan 2, 2006 3:04:05 PM MST",
"Jan 2, 2006 15:04:05 MST",
"2 January 2006 15:04:05 MST",
"2 Jan 2006 15:04:05 MST",
"2006-01-02 15:04:05 MST",
"1/2/2006 3:04:05 PM MST",
"1/2/2006 15:04:05 MST",
"02 Jan 2006 15:04 MST",
"02 Jan 2006 15:04:05 MST",
"02/01/2006 15:04 MST",
"02-01-2006 15:04:05 MST",
"01/02/2006 15:04:05 MST",
}
// ParseDate parses a given date string using a large
// list of commonly found feed date formats.
func ParseDate(ds string) (t time.Time, err error) {
d := strings.TrimSpace(ds)
if d == "" {
return t, fmt.Errorf("Date string is empty")
}
for _, f := range dateFormats {
if t, err = time.Parse(f, d); err == nil {
return
}
}
for _, f := range dateFormatsWithNamedZone {
t, err = time.Parse(f, d)
if err != nil {
continue
}
// This is a format match! Now try to load the timezone name
loc, err := time.LoadLocation(t.Location().String())
if err != nil {
// We couldn't load the TZ name. Just use UTC instead...
return t, nil
}
if t, err = time.ParseInLocation(f, ds, loc); err == nil {
return t, nil
}
// This should not be reachable
}
err = fmt.Errorf("Failed to parse date: %s", ds)
return
}


@@ -0,0 +1,176 @@
package shared
import (
"strings"
"github.com/mmcdole/gofeed/extensions"
"github.com/mmcdole/goxpp"
)
// IsExtension returns whether or not the current
// XML element is an extension element (if it has a
// non-empty prefix)
func IsExtension(p *xpp.XMLPullParser) bool {
space := strings.TrimSpace(p.Space)
if prefix, ok := p.Spaces[space]; ok {
return !(prefix == "" || prefix == "rss" || prefix == "rdf" || prefix == "content")
}
return p.Space != ""
}
// ParseExtension parses the current element of the
// XMLPullParser as an extension element and updates
// the extension map
func ParseExtension(fe ext.Extensions, p *xpp.XMLPullParser) (ext.Extensions, error) {
prefix := prefixForNamespace(p.Space, p)
result, err := parseExtensionElement(p)
if err != nil {
return nil, err
}
// Ensure the extension prefix map exists
if _, ok := fe[prefix]; !ok {
fe[prefix] = map[string][]ext.Extension{}
}
// Ensure the extension element slice exists
if _, ok := fe[prefix][p.Name]; !ok {
fe[prefix][p.Name] = []ext.Extension{}
}
fe[prefix][p.Name] = append(fe[prefix][p.Name], result)
return fe, nil
}
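// parseExtensionElement recursively parses the current element, its
// attributes, text and child elements into an ext.Extension value.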
func parseExtensionElement(p *xpp.XMLPullParser) (e ext.Extension, err error) {
if err = p.Expect(xpp.StartTag, "*"); err != nil {
return e, err
}
e.Name = p.Name
e.Children = map[string][]ext.Extension{}
e.Attrs = map[string]string{}
for _, attr := range p.Attrs {
// TODO: Is it alright that we are stripping
// namespace information from attributes?
e.Attrs[attr.Name.Local] = attr.Value
}
for {
tok, err := p.Next()
if err != nil {
return e, err
}
if tok == xpp.EndTag {
break
}
if tok == xpp.StartTag {
child, err := parseExtensionElement(p)
if err != nil {
return e, err
}
if _, ok := e.Children[child.Name]; !ok {
e.Children[child.Name] = []ext.Extension{}
}
e.Children[child.Name] = append(e.Children[child.Name], child)
} else if tok == xpp.Text {
e.Value += p.Text
}
}
e.Value = strings.TrimSpace(e.Value)
if err = p.Expect(xpp.EndTag, e.Name); err != nil {
return e, err
}
return e, nil
}
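// prefixForNamespace returns the canonical prefix for a namespace URI,
// falling back to the prefix declared in the feed or the namespace itself.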
func prefixForNamespace(space string, p *xpp.XMLPullParser) string {
// First we check if the global namespace map
// contains an entry for this namespace/prefix.
// This way we can use the canonical prefix for this
// ns instead of the one defined in the feed.
if prefix, ok := canonicalNamespaces[space]; ok {
return prefix
}
// Next we check if the feed itself defined this
// namespace and return it if we have a result.
if prefix, ok := p.Spaces[space]; ok {
return prefix
}
// Lastly, any namespace which is not defined in
// the feed will be the prefix itself when using Go's
// xml.Decoder.Token() method.
return space
}
// Namespaces taken from github.com/kurtmckee/feedparser
// These are used for determining canonical name space prefixes
// for many of the popular RSS/Atom extensions.
//
// These canonical prefixes override any prefixes used in the feed itself.
var canonicalNamespaces = map[string]string{
"http://webns.net/mvcb/": "admin",
"http://purl.org/rss/1.0/modules/aggregation/": "ag",
"http://purl.org/rss/1.0/modules/annotate/": "annotate",
"http://media.tangent.org/rss/1.0/": "audio",
"http://backend.userland.com/blogChannelModule": "blogChannel",
"http://creativecommons.org/ns#license": "cc",
"http://web.resource.org/cc/": "cc",
"http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html": "creativeCommons",
"http://backend.userland.com/creativeCommonsRssModule": "creativeCommons",
"http://purl.org/rss/1.0/modules/company": "co",
"http://purl.org/rss/1.0/modules/content/": "content",
"http://my.theinfo.org/changed/1.0/rss/": "cp",
"http://purl.org/dc/elements/1.1/": "dc",
"http://purl.org/dc/terms/": "dcterms",
"http://purl.org/rss/1.0/modules/email/": "email",
"http://purl.org/rss/1.0/modules/event/": "ev",
"http://rssnamespace.org/feedburner/ext/1.0": "feedburner",
"http://freshmeat.net/rss/fm/": "fm",
"http://xmlns.com/foaf/0.1/": "foaf",
"http://www.w3.org/2003/01/geo/wgs84_pos#": "geo",
"http://www.georss.org/georss": "georss",
"http://www.opengis.net/gml": "gml",
"http://postneo.com/icbm/": "icbm",
"http://purl.org/rss/1.0/modules/image/": "image",
"http://www.itunes.com/DTDs/PodCast-1.0.dtd": "itunes",
"http://example.com/DTDs/PodCast-1.0.dtd": "itunes",
"http://purl.org/rss/1.0/modules/link/": "l",
"http://search.yahoo.com/mrss": "media",
"http://search.yahoo.com/mrss/": "media",
"http://madskills.com/public/xml/rss/module/pingback/": "pingback",
"http://prismstandard.org/namespaces/1.2/basic/": "prism",
"http://www.w3.org/1999/02/22-rdf-syntax-ns#": "rdf",
"http://www.w3.org/2000/01/rdf-schema#": "rdfs",
"http://purl.org/rss/1.0/modules/reference/": "ref",
"http://purl.org/rss/1.0/modules/richequiv/": "reqv",
"http://purl.org/rss/1.0/modules/search/": "search",
"http://purl.org/rss/1.0/modules/slash/": "slash",
"http://schemas.xmlsoap.org/soap/envelope/": "soap",
"http://purl.org/rss/1.0/modules/servicestatus/": "ss",
"http://hacks.benhammersley.com/rss/streaming/": "str",
"http://purl.org/rss/1.0/modules/subscription/": "sub",
"http://purl.org/rss/1.0/modules/syndication/": "sy",
"http://schemas.pocketsoap.com/rss/myDescModule/": "szf",
"http://purl.org/rss/1.0/modules/taxonomy/": "taxo",
"http://purl.org/rss/1.0/modules/threading/": "thr",
"http://purl.org/rss/1.0/modules/textinput/": "ti",
"http://madskills.com/public/xml/rss/module/trackback/": "trackback",
"http://wellformedweb.org/commentAPI/": "wfw",
"http://purl.org/rss/1.0/modules/wiki/": "wiki",
"http://www.w3.org/1999/xhtml": "xhtml",
"http://www.w3.org/1999/xlink": "xlink",
"http://www.w3.org/XML/1998/namespace": "xml",
"http://podlove.org/simple-chapters": "psc",
}
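To make the shape of the resulting map concrete: an element such as <itunes:author>Jane Doe</itunes:author> found in a channel ends up keyed by the canonical "itunes" prefix regardless of the prefix the feed declared. A minimal sketch follows; the literal is hand-built for illustration, not the output of an actual parse.

package main

import (
    "fmt"

    ext "github.com/mmcdole/gofeed/extensions"
)

func main() {
    // Hand-built illustration of the prefix -> element name -> []Extension
    // layout that ParseExtension builds for <itunes:author>Jane Doe</itunes:author>.
    exts := ext.Extensions{
        "itunes": {
            "author": []ext.Extension{{
                Name:     "author",
                Value:    "Jane Doe",
                Attrs:    map[string]string{},
                Children: map[string][]ext.Extension{},
            }},
        },
    }
    if authors := exts["itunes"]["author"]; len(authors) > 0 {
        fmt.Println(authors[0].Value) // Jane Doe
    }
}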

View File

@@ -0,0 +1,194 @@
package shared
import (
"bytes"
"errors"
"fmt"
"regexp"
"strconv"
"strings"
xpp "github.com/mmcdole/goxpp"
)
var (
emailNameRgx = regexp.MustCompile(`^([^@]+@[^\s]+)\s+\(([^@]+)\)$`)
nameEmailRgx = regexp.MustCompile(`^([^@]+)\s+\(([^@]+@[^)]+)\)$`)
nameOnlyRgx = regexp.MustCompile(`^([^@()]+)$`)
emailOnlyRgx = regexp.MustCompile(`^([^@()]+@[^@()]+)$`)
TruncatedEntity = errors.New("truncated entity")
InvalidNumericReference = errors.New("invalid numeric reference")
)
const CDATA_START = "<![CDATA["
const CDATA_END = "]]>"
// ParseText is a helper function for parsing the text
// from the current element of the XMLPullParser.
// This function can handle parsing naked XML text from
// an element.
func ParseText(p *xpp.XMLPullParser) (string, error) {
var text struct {
Type string `xml:"type,attr"`
InnerXML string `xml:",innerxml"`
}
err := p.DecodeElement(&text)
if err != nil {
return "", err
}
result := text.InnerXML
result = strings.TrimSpace(result)
if strings.Contains(result, CDATA_START) {
return StripCDATA(result), nil
}
return DecodeEntities(result)
}
// StripCDATA removes CDATA tags from the string;
// content outside of CDATA tags is passed through DecodeEntities.
func StripCDATA(str string) string {
buf := bytes.NewBuffer([]byte{})
curr := 0
for curr < len(str) {
start := indexAt(str, CDATA_START, curr)
if start == -1 {
dec, _ := DecodeEntities(str[curr:])
buf.Write([]byte(dec))
return buf.String()
}
end := indexAt(str, CDATA_END, start)
if end == -1 {
dec, _ := DecodeEntities(str[curr:])
buf.Write([]byte(dec))
return buf.String()
}
// Decode any text before this CDATA section, then copy the
// CDATA body through verbatim.
dec, _ := DecodeEntities(str[curr:start])
buf.Write([]byte(dec))
buf.Write([]byte(str[start+len(CDATA_START) : end]))
// Resume scanning after the closing "]]>" marker.
curr = end + len(CDATA_END)
}
return buf.String()
}
// DecodeEntities decodes escaped XML entities
// in a string and returns the unescaped string
func DecodeEntities(str string) (string, error) {
data := []byte(str)
buf := bytes.NewBuffer([]byte{})
for len(data) > 0 {
// Find the next entity
idx := bytes.IndexByte(data, '&')
if idx == -1 {
buf.Write(data)
break
}
// Write and skip everything before it
buf.Write(data[:idx])
data = data[idx+1:]
if len(data) == 0 {
return "", TruncatedEntity
}
// Find the end of the entity
end := bytes.IndexByte(data, ';')
if end == -1 {
return "", TruncatedEntity
}
if data[0] == '#' {
// Numerical character reference
var str string
base := 10
if len(data) > 1 && data[1] == 'x' {
str = string(data[2:end])
base = 16
} else {
str = string(data[1:end])
}
i, err := strconv.ParseUint(str, base, 32)
if err != nil {
return "", InvalidNumericReference
}
buf.WriteRune(rune(i))
} else {
// Predefined entity
name := string(data[:end])
var c byte
switch name {
case "lt":
c = '<'
case "gt":
c = '>'
case "quot":
c = '"'
case "apos":
c = '\''
case "amp":
c = '&'
default:
return "", fmt.Errorf("unknown predefined "+
"entity &%s;", name)
}
buf.WriteByte(c)
}
// Skip the entity
data = data[end+1:]
}
return buf.String(), nil
}
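A quick illustration of the two helpers above, written as a test in the same package because shared sits under internal/ and cannot be imported from other modules; the test name and inputs are invented.

package shared

import "testing"

func TestParseUtilsSketch(t *testing.T) {
    // Predefined and numeric character references are decoded in place.
    decoded, err := DecodeEntities("Tom &amp; Jerry &#x263A;")
    if err != nil || decoded != "Tom & Jerry \u263a" {
        t.Fatalf("got %q, %v", decoded, err)
    }
    // CDATA content is copied through verbatim; text outside the CDATA
    // section is entity-decoded.
    if got := StripCDATA("<![CDATA[<b>raw</b>]]> &amp; more"); got != "<b>raw</b> & more" {
        t.Fatalf("got %q", got)
    }
}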
// ParseNameAddress parses name/email strings commonly
// found in RSS feeds of the format "Example Name (example@site.com)"
// and other variations of this format.
func ParseNameAddress(nameAddressText string) (name string, address string) {
if nameAddressText == "" {
return
}
if emailNameRgx.MatchString(nameAddressText) {
result := emailNameRgx.FindStringSubmatch(nameAddressText)
address = result[1]
name = result[2]
} else if nameEmailRgx.MatchString(nameAddressText) {
result := nameEmailRgx.FindStringSubmatch(nameAddressText)
name = result[1]
address = result[2]
} else if nameOnlyRgx.MatchString(nameAddressText) {
result := nameOnlyRgx.FindStringSubmatch(nameAddressText)
name = result[1]
} else if emailOnlyRgx.MatchString(nameAddressText) {
result := emailOnlyRgx.FindStringSubmatch(nameAddressText)
address = result[1]
}
return
}
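The two author formats this helper targets, shown as another in-package sketch with invented inputs.

package shared

import "testing"

func TestParseNameAddressSketch(t *testing.T) {
    // "email (name)" form.
    name, email := ParseNameAddress("editor@example.com (Jane Doe)")
    if name != "Jane Doe" || email != "editor@example.com" {
        t.Fatalf("got %q, %q", name, email)
    }
    // "name (email)" form.
    name, email = ParseNameAddress("Jane Doe (editor@example.com)")
    if name != "Jane Doe" || email != "editor@example.com" {
        t.Fatalf("got %q, %q", name, email)
    }
}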
func indexAt(str, substr string, start int) int {
idx := strings.Index(str[start:], substr)
if idx > -1 {
idx += start
}
return idx
}

View File

@@ -0,0 +1,258 @@
package shared
import (
"bytes"
"fmt"
"golang.org/x/net/html"
"net/url"
"strings"
"github.com/mmcdole/goxpp"
)
var (
// HTML attributes which contain URIs
// https://pythonhosted.org/feedparser/resolving-relative-links.html
// To catch every possible URI attribute is non-trivial:
// https://stackoverflow.com/questions/2725156/complete-list-of-html-tag-attributes-which-have-a-url-value
htmlURIAttrs = map[string]bool{
"action": true,
"background": true,
"cite": true,
"codebase": true,
"data": true,
"href": true,
"poster": true,
"profile": true,
"scheme": true,
"src": true,
"uri": true,
"usemap": true,
}
)
type urlStack []*url.URL
func (s *urlStack) push(u *url.URL) {
*s = append([]*url.URL{u}, *s...)
}
func (s *urlStack) pop() *url.URL {
if s == nil || len(*s) == 0 {
return nil
}
var top *url.URL
top, *s = (*s)[0], (*s)[1:]
return top
}
func (s *urlStack) top() *url.URL {
if s == nil || len(*s) == 0 {
return nil
}
return (*s)[0]
}
type XMLBase struct {
stack urlStack
URIAttrs map[string]bool
}
// FindRoot iterates through the tokens of an xml document until
// it encounters its first StartTag event. It returns an error
// if it reaches EndDocument before finding a tag.
func (b *XMLBase) FindRoot(p *xpp.XMLPullParser) (event xpp.XMLEventType, err error) {
for {
event, err = b.NextTag(p)
if err != nil {
return event, err
}
if event == xpp.StartTag {
break
}
if event == xpp.EndDocument {
return event, fmt.Errorf("Failed to find root node before document end.")
}
}
return
}
// XMLBase.NextTag iterates through the tokens until it reaches a StartTag or
// EndTag. It maintains the urlStack upon encountering StartTags and EndTags, so
// that the top of the stack (accessible through the CurrentBase() and
// CurrentBaseURL() methods) is the absolute base URI by which relative URIs
// should be resolved.
//
// NextTag is similar to goxpp's NextTag method except it won't throw an error
// if the next immediate token isn't a Start/EndTag. Instead, it will continue
// to consume tokens until it hits a Start/EndTag or EndDocument.
func (b *XMLBase) NextTag(p *xpp.XMLPullParser) (event xpp.XMLEventType, err error) {
for {
if p.Event == xpp.EndTag {
// Pop xml:base after each end tag
b.pop()
}
event, err = p.Next()
if err != nil {
return event, err
}
if event == xpp.EndTag {
break
}
if event == xpp.StartTag {
base := parseBase(p)
err = b.push(base)
if err != nil {
return
}
err = b.resolveAttrs(p)
if err != nil {
return
}
break
}
if event == xpp.EndDocument {
return event, fmt.Errorf("Failed to find NextTag before reaching the end of the document.")
}
}
return
}
func parseBase(p *xpp.XMLPullParser) string {
xmlURI := "http://www.w3.org/XML/1998/namespace"
for _, attr := range p.Attrs {
if attr.Name.Local == "base" && attr.Name.Space == xmlURI {
return attr.Value
}
}
return ""
}
func (b *XMLBase) push(base string) error {
newURL, err := url.Parse(base)
if err != nil {
return err
}
topURL := b.CurrentBaseURL()
if topURL != nil {
newURL = topURL.ResolveReference(newURL)
}
b.stack.push(newURL)
return nil
}
// returns the popped base URL
func (b *XMLBase) pop() string {
url := b.stack.pop()
if url != nil {
return url.String()
}
return ""
}
func (b *XMLBase) CurrentBaseURL() *url.URL {
return b.stack.top()
}
func (b *XMLBase) CurrentBase() string {
if url := b.CurrentBaseURL(); url != nil {
return url.String()
}
return ""
}
// resolve the given string as a URL relative to current base
func (b *XMLBase) ResolveURL(u string) (string, error) {
if b.CurrentBase() == "" {
return u, nil
}
relURL, err := url.Parse(u)
if err != nil {
return u, err
}
curr := b.CurrentBaseURL()
if curr.Path != "" && u != "" && curr.Path[len(curr.Path)-1] != '/' {
// There's no reason someone would use a path in xml:base if they
// didn't mean for it to be a directory
curr.Path = curr.Path + "/"
}
absURL := b.CurrentBaseURL().ResolveReference(relURL)
return absURL.String(), nil
}
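An in-package sketch of the resolution behaviour; push is the unexported helper defined above, and the URLs are invented for illustration.

package shared

import "testing"

func TestResolveURLSketch(t *testing.T) {
    b := &XMLBase{}
    if err := b.push("http://example.com/feeds/blog"); err != nil {
        t.Fatal(err)
    }
    abs, err := b.ResolveURL("images/logo.png")
    if err != nil {
        t.Fatal(err)
    }
    // The base path is treated as a directory, so a trailing slash is added
    // before the relative reference is resolved against it.
    if abs != "http://example.com/feeds/blog/images/logo.png" {
        t.Fatalf("got %q", abs)
    }
}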
// resolve relative URI attributes according to xml:base
func (b *XMLBase) resolveAttrs(p *xpp.XMLPullParser) error {
for i, attr := range p.Attrs {
lowerName := strings.ToLower(attr.Name.Local)
if b.URIAttrs[lowerName] {
absURL, err := b.ResolveURL(attr.Value)
if err != nil {
return err
}
p.Attrs[i].Value = absURL
}
}
return nil
}
// ResolveHTML transforms HTML by resolving any relative URIs in attributes.
// If an error occurs during parsing or serialization, the original string
// is returned along with the error.
func (b *XMLBase) ResolveHTML(relHTML string) (string, error) {
if b.CurrentBase() == "" {
return relHTML, nil
}
htmlReader := strings.NewReader(relHTML)
doc, err := html.Parse(htmlReader)
if err != nil {
return relHTML, err
}
var visit func(*html.Node)
// recursively traverse HTML resolving any relative URIs in attributes
visit = func(n *html.Node) {
if n.Type == html.ElementNode {
for i, a := range n.Attr {
if htmlURIAttrs[a.Key] {
absVal, err := b.ResolveURL(a.Val)
if err == nil {
n.Attr[i].Val = absVal
}
break
}
}
}
for c := n.FirstChild; c != nil; c = c.NextSibling {
visit(c)
}
}
visit(doc)
var w bytes.Buffer
err = html.Render(&w, doc)
if err != nil {
return relHTML, err
}
// html.Render() always writes a complete html5 document, so strip the html
// and body tags
absHTML := w.String()
absHTML = strings.TrimPrefix(absHTML, "<html><head></head><body>")
absHTML = strings.TrimSuffix(absHTML, "</body></html>")
return absHTML, err
}
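And the HTML counterpart, assuming the same kind of hypothetical base URL.

package shared

import (
    "strings"
    "testing"
)

func TestResolveHTMLSketch(t *testing.T) {
    b := &XMLBase{}
    if err := b.push("http://example.com/posts/"); err != nil {
        t.Fatal(err)
    }
    out, err := b.ResolveHTML(`<p>See <a href="42.html">the post</a></p>`)
    if err != nil {
        t.Fatal(err)
    }
    // The relative href is rewritten against the base; the html/body wrapper
    // added by html.Render is stripped again before returning.
    if !strings.Contains(out, `href="http://example.com/posts/42.html"`) {
        t.Fatalf("got %q", out)
    }
}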

View File

@@ -0,0 +1,23 @@
package shared
import (
"io"
"golang.org/x/text/transform"
)
// NewXMLSanitizerReader creates an io.Reader that
// wraps another io.Reader and removes illegal xml
// characters from the io stream.
func NewXMLSanitizerReader(xml io.Reader) io.Reader {
isIllegal := func(r rune) bool {
// Allow only the ranges from the XML 1.0 Char production:
// #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF].
return !(r == 0x09 ||
r == 0x0A ||
r == 0x0D ||
r >= 0x20 && r <= 0xD7FF ||
r >= 0xE000 && r <= 0xFFFD ||
r >= 0x10000 && r <= 0x10FFFF)
}
t := transform.Chain(transform.RemoveFunc(isIllegal))
return transform.NewReader(xml, t)
}
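A small in-package sketch of the sanitizer in action; the input string is invented.

package shared

import (
    "io/ioutil"
    "strings"
    "testing"
)

func TestXMLSanitizerSketch(t *testing.T) {
    dirty := strings.NewReader("<title>bad \x08 char</title>")
    clean, err := ioutil.ReadAll(NewXMLSanitizerReader(dirty))
    if err != nil {
        t.Fatal(err)
    }
    // The 0x08 control character is illegal in XML and is dropped.
    if string(clean) != "<title>bad  char</title>" {
        t.Fatalf("got %q", clean)
    }
}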

165
vendor/github.com/mmcdole/gofeed/parser.go generated vendored Normal file
View File

@@ -0,0 +1,165 @@
package gofeed
import (
"bytes"
"context"
"errors"
"fmt"
"io"
"net/http"
"strings"
"github.com/mmcdole/gofeed/atom"
"github.com/mmcdole/gofeed/rss"
)
// ErrFeedTypeNotDetected is returned when the detection system cannot figure
// out the feed format
var ErrFeedTypeNotDetected = errors.New("Failed to detect feed type")
// HTTPError represents an HTTP error returned by a server.
type HTTPError struct {
StatusCode int
Status string
}
func (err HTTPError) Error() string {
return fmt.Sprintf("http error: %s", err.Status)
}
// Parser is a universal feed parser that detects
// a given feed type, parses it, and translates it
// to the universal feed type.
type Parser struct {
AtomTranslator Translator
RSSTranslator Translator
Client *http.Client
rp *rss.Parser
ap *atom.Parser
}
// NewParser creates a universal feed parser.
func NewParser() *Parser {
fp := Parser{
rp: &rss.Parser{},
ap: &atom.Parser{},
}
return &fp
}
// Parse parses an RSS or Atom feed into
// the universal gofeed.Feed. It takes an
// io.Reader which should return the XML content.
func (f *Parser) Parse(feed io.Reader) (*Feed, error) {
// Wrap the feed io.Reader in an io.TeeReader
// so we can capture all the bytes read by the
// DetectFeedType function and construct a new
// reader with those bytes intact for when we
// attempt to parse the feed.
var buf bytes.Buffer
tee := io.TeeReader(feed, &buf)
feedType := DetectFeedType(tee)
// Glue the read bytes from the detect function
// back into a new reader
r := io.MultiReader(&buf, feed)
switch feedType {
case FeedTypeAtom:
return f.parseAtomFeed(r)
case FeedTypeRSS:
return f.parseRSSFeed(r)
}
return nil, ErrFeedTypeNotDetected
}
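The buffer-and-replay trick above is a general pattern, not specific to feeds; here is a standalone sketch with a hypothetical peek helper standing in for DetectFeedType.

package main

import (
    "bytes"
    "fmt"
    "io"
    "io/ioutil"
    "strings"
)

// peek inspects up to n bytes of r while keeping the full stream replayable,
// using the same TeeReader/MultiReader combination as Parse above.
func peek(r io.Reader, n int) ([]byte, io.Reader) {
    var buf bytes.Buffer
    head := make([]byte, n)
    m, _ := io.ReadFull(io.TeeReader(r, &buf), head)
    return head[:m], io.MultiReader(&buf, r)
}

func main() {
    head, full := peek(strings.NewReader(`<rss version="2.0"></rss>`), 5)
    fmt.Printf("peeked: %q\n", head)
    rest, _ := ioutil.ReadAll(full)
    fmt.Printf("replayed: %q\n", rest)
}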
// ParseURL fetches the contents of a given url and
// attempts to parse the response into the universal feed type.
func (f *Parser) ParseURL(feedURL string) (feed *Feed, err error) {
return f.ParseURLWithContext(feedURL, context.Background())
}
// ParseURLWithContext fetches the contents of a given URL and
// attempts to parse the response into the universal feed type.
// The request can be canceled or timed out via the given context.
func (f *Parser) ParseURLWithContext(feedURL string, ctx context.Context) (feed *Feed, err error) {
client := f.httpClient()
req, err := http.NewRequest("GET", feedURL, nil)
if err != nil {
return nil, err
}
req = req.WithContext(ctx)
req.Header.Set("User-Agent", "Gofeed/1.0")
resp, err := client.Do(req)
if err != nil {
return nil, err
}
if resp != nil {
defer func() {
ce := resp.Body.Close()
if ce != nil {
err = ce
}
}()
}
if resp.StatusCode < 200 || resp.StatusCode >= 300 {
return nil, HTTPError{
StatusCode: resp.StatusCode,
Status: resp.Status,
}
}
return f.Parse(resp.Body)
}
// ParseString parses a feed XML string into the
// universal feed type.
func (f *Parser) ParseString(feed string) (*Feed, error) {
return f.Parse(strings.NewReader(feed))
}
func (f *Parser) parseAtomFeed(feed io.Reader) (*Feed, error) {
af, err := f.ap.Parse(feed)
if err != nil {
return nil, err
}
return f.atomTrans().Translate(af)
}
func (f *Parser) parseRSSFeed(feed io.Reader) (*Feed, error) {
rf, err := f.rp.Parse(feed)
if err != nil {
return nil, err
}
return f.rssTrans().Translate(rf)
}
func (f *Parser) atomTrans() Translator {
if f.AtomTranslator != nil {
return f.AtomTranslator
}
f.AtomTranslator = &DefaultAtomTranslator{}
return f.AtomTranslator
}
func (f *Parser) rssTrans() Translator {
if f.RSSTranslator != nil {
return f.RSSTranslator
}
f.RSSTranslator = &DefaultRSSTranslator{}
return f.RSSTranslator
}
func (f *Parser) httpClient() *http.Client {
if f.Client != nil {
return f.Client
}
f.Client = &http.Client{}
return f.Client
}
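Taken together, typical use of the universal parser looks roughly like this; the feed URL is only an example.

package main

import (
    "fmt"
    "log"

    "github.com/mmcdole/gofeed"
)

func main() {
    fp := gofeed.NewParser()
    // Fetch and parse from a URL; the default http.Client is used unless
    // fp.Client is set.
    feed, err := fp.ParseURL("http://feeds.twit.tv/twit.xml")
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(feed.Title)
    // Or parse XML that is already in memory.
    feed, err = fp.ParseString(`<rss version="2.0"><channel><title>Sample</title></channel></rss>`)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(feed.Title, feed.FeedType, feed.FeedVersion)
}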

120
vendor/github.com/mmcdole/gofeed/rss/feed.go generated vendored Normal file
View File

@@ -0,0 +1,120 @@
package rss
import (
"encoding/json"
"time"
"github.com/mmcdole/gofeed/extensions"
)
// Feed is an RSS Feed
type Feed struct {
Title string `json:"title,omitempty"`
Link string `json:"link,omitempty"`
Description string `json:"description,omitempty"`
Language string `json:"language,omitempty"`
Copyright string `json:"copyright,omitempty"`
ManagingEditor string `json:"managingEditor,omitempty"`
WebMaster string `json:"webMaster,omitempty"`
PubDate string `json:"pubDate,omitempty"`
PubDateParsed *time.Time `json:"pubDateParsed,omitempty"`
LastBuildDate string `json:"lastBuildDate,omitempty"`
LastBuildDateParsed *time.Time `json:"lastBuildDateParsed,omitempty"`
Categories []*Category `json:"categories,omitempty"`
Generator string `json:"generator,omitempty"`
Docs string `json:"docs,omitempty"`
TTL string `json:"ttl,omitempty"`
Image *Image `json:"image,omitempty"`
Rating string `json:"rating,omitempty"`
SkipHours []string `json:"skipHours,omitempty"`
SkipDays []string `json:"skipDays,omitempty"`
Cloud *Cloud `json:"cloud,omitempty"`
TextInput *TextInput `json:"textInput,omitempty"`
DublinCoreExt *ext.DublinCoreExtension `json:"dcExt,omitempty"`
ITunesExt *ext.ITunesFeedExtension `json:"itunesExt,omitempty"`
Extensions ext.Extensions `json:"extensions,omitempty"`
Items []*Item `json:"items"`
Version string `json:"version"`
}
func (f Feed) String() string {
json, _ := json.MarshalIndent(f, "", " ")
return string(json)
}
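A tiny sketch of the String helper with hand-filled fields.

package main

import (
    "fmt"

    "github.com/mmcdole/gofeed/rss"
)

func main() {
    f := rss.Feed{
        Title:   "Sample Feed",
        Version: "2.0",
        Items:   []*rss.Item{{Title: "First post"}},
    }
    // Prints indented JSON; fields tagged omitempty and left at their zero
    // value are not included in the output.
    fmt.Println(f.String())
}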
// Item is an RSS Item
type Item struct {
Title string `json:"title,omitempty"`
Link string `json:"link,omitempty"`
Description string `json:"description,omitempty"`
Content string `json:"content,omitempty"`
Author string `json:"author,omitempty"`
Categories []*Category `json:"categories,omitempty"`
Comments string `json:"comments,omitempty"`
Enclosure *Enclosure `json:"enclosure,omitempty"`
GUID *GUID `json:"guid,omitempty"`
PubDate string `json:"pubDate,omitempty"`
PubDateParsed *time.Time `json:"pubDateParsed,omitempty"`
Source *Source `json:"source,omitempty"`
DublinCoreExt *ext.DublinCoreExtension `json:"dcExt,omitempty"`
ITunesExt *ext.ITunesItemExtension `json:"itunesExt,omitempty"`
Extensions ext.Extensions `json:"extensions,omitempty"`
}
// Image is an image that represents the feed
type Image struct {
URL string `json:"url,omitempty"`
Link string `json:"link,omitempty"`
Title string `json:"title,omitempty"`
Width string `json:"width,omitempty"`
Height string `json:"height,omitempty"`
Description string `json:"description,omitempty"`
}
// Enclosure is a media object that is attached to
// the item
type Enclosure struct {
URL string `json:"url,omitempty"`
Length string `json:"length,omitempty"`
Type string `json:"type,omitempty"`
}
// GUID is a unique identifier for an item
type GUID struct {
Value string `json:"value,omitempty"`
IsPermalink string `json:"isPermalink,omitempty"`
}
// Source contains feed information for another
// feed if a given item came from that feed
type Source struct {
Title string `json:"title,omitempty"`
URL string `json:"url,omitempty"`
}
// Category is category metadata for Feeds and Entries
type Category struct {
Domain string `json:"domain,omitempty"`
Value string `json:"value,omitempty"`
}
// TextInput specifies a text input box that
// can be displayed with the channel
type TextInput struct {
Title string `json:"title,omitempty"`
Description string `json:"description,omitempty"`
Name string `json:"name,omitempty"`
Link string `json:"link,omitempty"`
}
// Cloud allows processes to register with a
// cloud to be notified of updates to the channel,
// implementing a lightweight publish-subscribe protocol
// for RSS feeds
type Cloud struct {
Domain string `json:"domain,omitempty"`
Port string `json:"port,omitempty"`
Path string `json:"path,omitempty"`
RegisterProcedure string `json:"registerProcedure,omitempty"`
Protocol string `json:"protocol,omitempty"`
}

770
vendor/github.com/mmcdole/gofeed/rss/parser.go generated vendored Normal file
View File

@@ -0,0 +1,770 @@
package rss
import (
"fmt"
"io"
"strings"
ext "github.com/mmcdole/gofeed/extensions"
"github.com/mmcdole/gofeed/internal/shared"
xpp "github.com/mmcdole/goxpp"
)
// Parser is an RSS Parser
type Parser struct {
base *shared.XMLBase
}
// Parse parses an xml feed into an rss.Feed
func (rp *Parser) Parse(feed io.Reader) (*Feed, error) {
p := xpp.NewXMLPullParser(feed, false, shared.NewReaderLabel)
rp.base = &shared.XMLBase{}
_, err := rp.base.FindRoot(p)
if err != nil {
return nil, err
}
return rp.parseRoot(p)
}
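A minimal sketch of driving this parser directly, bypassing the universal gofeed.Parser; the sample XML is invented.

package main

import (
    "fmt"
    "log"
    "strings"

    "github.com/mmcdole/gofeed/rss"
)

func main() {
    raw := `<rss version="2.0"><channel><title>Sample</title><item><title>Hello</title></item></channel></rss>`
    rp := &rss.Parser{}
    feed, err := rp.Parse(strings.NewReader(raw))
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(feed.Title, feed.Version, len(feed.Items)) // Sample 2.0 1
}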
func (rp *Parser) parseRoot(p *xpp.XMLPullParser) (*Feed, error) {
rssErr := p.Expect(xpp.StartTag, "rss")
rdfErr := p.Expect(xpp.StartTag, "rdf")
if rssErr != nil && rdfErr != nil {
return nil, fmt.Errorf("%s or %s", rssErr.Error(), rdfErr.Error())
}
// Items found in feed root
var channel *Feed
var textinput *TextInput
var image *Image
items := []*Item{}
ver := rp.parseVersion(p)
for {
tok, err := rp.base.NextTag(p)
if err != nil {
return nil, err
}
if tok == xpp.EndTag {
break
}
if tok == xpp.StartTag {
// Skip any extensions found in the feed root.
if shared.IsExtension(p) {
p.Skip()
continue
}
name := strings.ToLower(p.Name)
if name == "channel" {
channel, err = rp.parseChannel(p)
if err != nil {
return nil, err
}
} else if name == "item" {
item, err := rp.parseItem(p)
if err != nil {
return nil, err
}
items = append(items, item)
} else if name == "textinput" {
textinput, err = rp.parseTextInput(p)
if err != nil {
return nil, err
}
} else if name == "image" {
image, err = rp.parseImage(p)
if err != nil {
return nil, err
}
} else {
p.Skip()
}
}
}
rssErr = p.Expect(xpp.EndTag, "rss")
rdfErr = p.Expect(xpp.EndTag, "rdf")
if rssErr != nil && rdfErr != nil {
return nil, fmt.Errorf("%s or %s", rssErr.Error(), rdfErr.Error())
}
if channel == nil {
channel = &Feed{}
channel.Items = []*Item{}
}
if len(items) > 0 {
channel.Items = append(channel.Items, items...)
}
if textinput != nil {
channel.TextInput = textinput
}
if image != nil {
channel.Image = image
}
channel.Version = ver
return channel, nil
}
func (rp *Parser) parseChannel(p *xpp.XMLPullParser) (rss *Feed, err error) {
if err = p.Expect(xpp.StartTag, "channel"); err != nil {
return nil, err
}
rss = &Feed{}
rss.Items = []*Item{}
extensions := ext.Extensions{}
categories := []*Category{}
for {
tok, err := rp.base.NextTag(p)
if err != nil {
return nil, err
}
if tok == xpp.EndTag {
break
}
if tok == xpp.StartTag {
name := strings.ToLower(p.Name)
if shared.IsExtension(p) {
ext, err := shared.ParseExtension(extensions, p)
if err != nil {
return nil, err
}
extensions = ext
} else if name == "title" {
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
rss.Title = result
} else if name == "description" {
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
rss.Description = result
} else if name == "link" {
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
rss.Link = result
} else if name == "language" {
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
rss.Language = result
} else if name == "copyright" {
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
rss.Copyright = result
} else if name == "managingeditor" {
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
rss.ManagingEditor = result
} else if name == "webmaster" {
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
rss.WebMaster = result
} else if name == "pubdate" {
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
rss.PubDate = result
date, err := shared.ParseDate(result)
if err == nil {
utcDate := date.UTC()
rss.PubDateParsed = &utcDate
}
} else if name == "lastbuilddate" {
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
rss.LastBuildDate = result
date, err := shared.ParseDate(result)
if err == nil {
utcDate := date.UTC()
rss.LastBuildDateParsed = &utcDate
}
} else if name == "generator" {
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
rss.Generator = result
} else if name == "docs" {
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
rss.Docs = result
} else if name == "ttl" {
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
rss.TTL = result
} else if name == "rating" {
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
rss.Rating = result
} else if name == "skiphours" {
result, err := rp.parseSkipHours(p)
if err != nil {
return nil, err
}
rss.SkipHours = result
} else if name == "skipdays" {
result, err := rp.parseSkipDays(p)
if err != nil {
return nil, err
}
rss.SkipDays = result
} else if name == "item" {
result, err := rp.parseItem(p)
if err != nil {
return nil, err
}
rss.Items = append(rss.Items, result)
} else if name == "cloud" {
result, err := rp.parseCloud(p)
if err != nil {
return nil, err
}
rss.Cloud = result
} else if name == "category" {
result, err := rp.parseCategory(p)
if err != nil {
return nil, err
}
categories = append(categories, result)
} else if name == "image" {
result, err := rp.parseImage(p)
if err != nil {
return nil, err
}
rss.Image = result
} else if name == "textinput" {
result, err := rp.parseTextInput(p)
if err != nil {
return nil, err
}
rss.TextInput = result
} else {
// Skip the element as it isn't an extension and is not
// part of the spec
p.Skip()
}
}
}
if err = p.Expect(xpp.EndTag, "channel"); err != nil {
return nil, err
}
if len(categories) > 0 {
rss.Categories = categories
}
if len(extensions) > 0 {
rss.Extensions = extensions
if itunes, ok := rss.Extensions["itunes"]; ok {
rss.ITunesExt = ext.NewITunesFeedExtension(itunes)
}
if dc, ok := rss.Extensions["dc"]; ok {
rss.DublinCoreExt = ext.NewDublinCoreExtension(dc)
}
}
return rss, nil
}
func (rp *Parser) parseItem(p *xpp.XMLPullParser) (item *Item, err error) {
if err = p.Expect(xpp.StartTag, "item"); err != nil {
return nil, err
}
item = &Item{}
extensions := ext.Extensions{}
categories := []*Category{}
for {
tok, err := rp.base.NextTag(p)
if err != nil {
return nil, err
}
if tok == xpp.EndTag {
break
}
if tok == xpp.StartTag {
name := strings.ToLower(p.Name)
if shared.IsExtension(p) {
ext, err := shared.ParseExtension(extensions, p)
if err != nil {
return nil, err
}
extensions = ext
} else if name == "title" {
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
item.Title = result
} else if name == "description" {
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
item.Description = result
} else if name == "encoded" {
space := strings.TrimSpace(p.Space)
if prefix, ok := p.Spaces[space]; ok && prefix == "content" {
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
item.Content = result
}
} else if name == "link" {
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
item.Link = result
} else if name == "author" {
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
item.Author = result
} else if name == "comments" {
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
item.Comments = result
} else if name == "pubdate" {
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
item.PubDate = result
date, err := shared.ParseDate(result)
if err == nil {
utcDate := date.UTC()
item.PubDateParsed = &utcDate
}
} else if name == "source" {
result, err := rp.parseSource(p)
if err != nil {
return nil, err
}
item.Source = result
} else if name == "enclosure" {
result, err := rp.parseEnclosure(p)
if err != nil {
return nil, err
}
item.Enclosure = result
} else if name == "guid" {
result, err := rp.parseGUID(p)
if err != nil {
return nil, err
}
item.GUID = result
} else if name == "category" {
result, err := rp.parseCategory(p)
if err != nil {
return nil, err
}
categories = append(categories, result)
} else {
// Skip any elements not part of the item spec
p.Skip()
}
}
}
if len(categories) > 0 {
item.Categories = categories
}
if len(extensions) > 0 {
item.Extensions = extensions
if itunes, ok := item.Extensions["itunes"]; ok {
item.ITunesExt = ext.NewITunesItemExtension(itunes)
}
if dc, ok := item.Extensions["dc"]; ok {
item.DublinCoreExt = ext.NewDublinCoreExtension(dc)
}
}
if err = p.Expect(xpp.EndTag, "item"); err != nil {
return nil, err
}
return item, nil
}
func (rp *Parser) parseSource(p *xpp.XMLPullParser) (source *Source, err error) {
if err = p.Expect(xpp.StartTag, "source"); err != nil {
return nil, err
}
source = &Source{}
source.URL = p.Attribute("url")
result, err := shared.ParseText(p)
if err != nil {
return source, err
}
source.Title = result
if err = p.Expect(xpp.EndTag, "source"); err != nil {
return nil, err
}
return source, nil
}
func (rp *Parser) parseEnclosure(p *xpp.XMLPullParser) (enclosure *Enclosure, err error) {
if err = p.Expect(xpp.StartTag, "enclosure"); err != nil {
return nil, err
}
enclosure = &Enclosure{}
enclosure.URL = p.Attribute("url")
enclosure.Length = p.Attribute("length")
enclosure.Type = p.Attribute("type")
// Ignore any enclosure text
_, err = p.NextText()
if err != nil {
return enclosure, err
}
if err = p.Expect(xpp.EndTag, "enclosure"); err != nil {
return nil, err
}
return enclosure, nil
}
func (rp *Parser) parseImage(p *xpp.XMLPullParser) (image *Image, err error) {
if err = p.Expect(xpp.StartTag, "image"); err != nil {
return nil, err
}
image = &Image{}
for {
tok, err := rp.base.NextTag(p)
if err != nil {
return image, err
}
if tok == xpp.EndTag {
break
}
if tok == xpp.StartTag {
name := strings.ToLower(p.Name)
if name == "url" {
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
image.URL = result
} else if name == "title" {
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
image.Title = result
} else if name == "link" {
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
image.Link = result
} else if name == "width" {
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
image.Width = result
} else if name == "height" {
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
image.Height = result
} else if name == "description" {
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
image.Description = result
} else {
p.Skip()
}
}
}
if err = p.Expect(xpp.EndTag, "image"); err != nil {
return nil, err
}
return image, nil
}
func (rp *Parser) parseGUID(p *xpp.XMLPullParser) (guid *GUID, err error) {
if err = p.Expect(xpp.StartTag, "guid"); err != nil {
return nil, err
}
guid = &GUID{}
guid.IsPermalink = p.Attribute("isPermalink")
result, err := shared.ParseText(p)
if err != nil {
return
}
guid.Value = result
if err = p.Expect(xpp.EndTag, "guid"); err != nil {
return nil, err
}
return guid, nil
}
func (rp *Parser) parseCategory(p *xpp.XMLPullParser) (cat *Category, err error) {
if err = p.Expect(xpp.StartTag, "category"); err != nil {
return nil, err
}
cat = &Category{}
cat.Domain = p.Attribute("domain")
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
cat.Value = result
if err = p.Expect(xpp.EndTag, "category"); err != nil {
return nil, err
}
return cat, nil
}
func (rp *Parser) parseTextInput(p *xpp.XMLPullParser) (*TextInput, error) {
if err := p.Expect(xpp.StartTag, "textinput"); err != nil {
return nil, err
}
ti := &TextInput{}
for {
tok, err := rp.base.NextTag(p)
if err != nil {
return nil, err
}
if tok == xpp.EndTag {
break
}
if tok == xpp.StartTag {
name := strings.ToLower(p.Name)
if name == "title" {
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
ti.Title = result
} else if name == "description" {
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
ti.Description = result
} else if name == "name" {
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
ti.Name = result
} else if name == "link" {
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
ti.Link = result
} else {
p.Skip()
}
}
}
if err := p.Expect(xpp.EndTag, "textinput"); err != nil {
return nil, err
}
return ti, nil
}
func (rp *Parser) parseSkipHours(p *xpp.XMLPullParser) ([]string, error) {
if err := p.Expect(xpp.StartTag, "skiphours"); err != nil {
return nil, err
}
hours := []string{}
for {
tok, err := rp.base.NextTag(p)
if err != nil {
return nil, err
}
if tok == xpp.EndTag {
break
}
if tok == xpp.StartTag {
name := strings.ToLower(p.Name)
if name == "hour" {
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
hours = append(hours, result)
} else {
p.Skip()
}
}
}
if err := p.Expect(xpp.EndTag, "skiphours"); err != nil {
return nil, err
}
return hours, nil
}
func (rp *Parser) parseSkipDays(p *xpp.XMLPullParser) ([]string, error) {
if err := p.Expect(xpp.StartTag, "skipdays"); err != nil {
return nil, err
}
days := []string{}
for {
tok, err := rp.base.NextTag(p)
if err != nil {
return nil, err
}
if tok == xpp.EndTag {
break
}
if tok == xpp.StartTag {
name := strings.ToLower(p.Name)
if name == "day" {
result, err := shared.ParseText(p)
if err != nil {
return nil, err
}
days = append(days, result)
} else {
p.Skip()
}
}
}
if err := p.Expect(xpp.EndTag, "skipdays"); err != nil {
return nil, err
}
return days, nil
}
func (rp *Parser) parseCloud(p *xpp.XMLPullParser) (*Cloud, error) {
if err := p.Expect(xpp.StartTag, "cloud"); err != nil {
return nil, err
}
cloud := &Cloud{}
cloud.Domain = p.Attribute("domain")
cloud.Port = p.Attribute("port")
cloud.Path = p.Attribute("path")
cloud.RegisterProcedure = p.Attribute("registerProcedure")
cloud.Protocol = p.Attribute("protocol")
rp.base.NextTag(p)
if err := p.Expect(xpp.EndTag, "cloud"); err != nil {
return nil, err
}
return cloud, nil
}
func (rp *Parser) parseVersion(p *xpp.XMLPullParser) (ver string) {
name := strings.ToLower(p.Name)
if name == "rss" {
ver = p.Attribute("version")
} else if name == "rdf" {
ns := p.Attribute("xmlns")
if ns == "http://channel.netscape.com/rdf/simple/0.9/" ||
ns == "http://my.netscape.com/rdf/simple/0.9/" {
ver = "0.9"
} else if ns == "http://purl.org/rss/1.0/" {
ver = "1.0"
}
}
return
}

686
vendor/github.com/mmcdole/gofeed/translator.go generated vendored Normal file
View File

@@ -0,0 +1,686 @@
package gofeed
import (
"fmt"
"strings"
"time"
"github.com/mmcdole/gofeed/atom"
ext "github.com/mmcdole/gofeed/extensions"
"github.com/mmcdole/gofeed/internal/shared"
"github.com/mmcdole/gofeed/rss"
)
// Translator converts a particular feed (atom.Feed or rss.Feed)
// into the generic Feed struct
type Translator interface {
Translate(feed interface{}) (*Feed, error)
}
// DefaultRSSTranslator converts an rss.Feed struct
// into the generic Feed struct.
//
// This default implementation defines a set of
// mapping rules between rss.Feed -> Feed
// for each of the fields in Feed.
type DefaultRSSTranslator struct{}
// Translate converts an RSS feed into the universal
// feed type.
func (t *DefaultRSSTranslator) Translate(feed interface{}) (*Feed, error) {
rss, found := feed.(*rss.Feed)
if !found {
return nil, fmt.Errorf("Feed did not match expected type of *rss.Feed")
}
result := &Feed{}
result.Title = t.translateFeedTitle(rss)
result.Description = t.translateFeedDescription(rss)
result.Link = t.translateFeedLink(rss)
result.FeedLink = t.translateFeedFeedLink(rss)
result.Updated = t.translateFeedUpdated(rss)
result.UpdatedParsed = t.translateFeedUpdatedParsed(rss)
result.Published = t.translateFeedPublished(rss)
result.PublishedParsed = t.translateFeedPublishedParsed(rss)
result.Author = t.translateFeedAuthor(rss)
result.Language = t.translateFeedLanguage(rss)
result.Image = t.translateFeedImage(rss)
result.Copyright = t.translateFeedCopyright(rss)
result.Generator = t.translateFeedGenerator(rss)
result.Categories = t.translateFeedCategories(rss)
result.Items = t.translateFeedItems(rss)
result.ITunesExt = rss.ITunesExt
result.DublinCoreExt = rss.DublinCoreExt
result.Extensions = rss.Extensions
result.FeedVersion = rss.Version
result.FeedType = "rss"
return result, nil
}
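Because the Parser's RSSTranslator and AtomTranslator fields accept any Translator, the default mapping rules can be wrapped rather than reimplemented; a hypothetical custom translator might look like this.

package main

import (
    "fmt"

    "github.com/mmcdole/gofeed"
    "github.com/mmcdole/gofeed/rss"
)

// MyCustomTranslator is a hypothetical example that delegates to the default
// RSS translator and then overrides a single mapping rule.
type MyCustomTranslator struct {
    defaultTranslator *gofeed.DefaultRSSTranslator
}

func (ct *MyCustomTranslator) Translate(feed interface{}) (*gofeed.Feed, error) {
    rssFeed, found := feed.(*rss.Feed)
    if !found {
        return nil, fmt.Errorf("feed did not match expected type of *rss.Feed")
    }
    f, err := ct.defaultTranslator.Translate(rssFeed)
    if err != nil {
        return nil, err
    }
    // Example rule change: prefer the iTunes subtitle as the description.
    if rssFeed.ITunesExt != nil && rssFeed.ITunesExt.Subtitle != "" {
        f.Description = rssFeed.ITunesExt.Subtitle
    }
    return f, nil
}

func main() {
    fp := gofeed.NewParser()
    fp.RSSTranslator = &MyCustomTranslator{defaultTranslator: &gofeed.DefaultRSSTranslator{}}
    feed, err := fp.ParseString(`<rss version="2.0"><channel><title>Sample</title></channel></rss>`)
    if err != nil {
        panic(err)
    }
    fmt.Println(feed.Title)
}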
func (t *DefaultRSSTranslator) translateFeedItem(rssItem *rss.Item) (item *Item) {
item = &Item{}
item.Title = t.translateItemTitle(rssItem)
item.Description = t.translateItemDescription(rssItem)
item.Content = t.translateItemContent(rssItem)
item.Link = t.translateItemLink(rssItem)
item.Published = t.translateItemPublished(rssItem)
item.PublishedParsed = t.translateItemPublishedParsed(rssItem)
item.Author = t.translateItemAuthor(rssItem)
item.GUID = t.translateItemGUID(rssItem)
item.Image = t.translateItemImage(rssItem)
item.Categories = t.translateItemCategories(rssItem)
item.Enclosures = t.translateItemEnclosures(rssItem)
item.DublinCoreExt = rssItem.DublinCoreExt
item.ITunesExt = rssItem.ITunesExt
item.Extensions = rssItem.Extensions
return
}
func (t *DefaultRSSTranslator) translateFeedTitle(rss *rss.Feed) (title string) {
if rss.Title != "" {
title = rss.Title
} else if rss.DublinCoreExt != nil && rss.DublinCoreExt.Title != nil {
title = t.firstEntry(rss.DublinCoreExt.Title)
}
return
}
func (t *DefaultRSSTranslator) translateFeedDescription(rss *rss.Feed) (desc string) {
return rss.Description
}
func (t *DefaultRSSTranslator) translateFeedLink(rss *rss.Feed) (link string) {
if rss.Link != "" {
link = rss.Link
} else if rss.ITunesExt != nil && rss.ITunesExt.Subtitle != "" {
link = rss.ITunesExt.Subtitle
}
return
}
func (t *DefaultRSSTranslator) translateFeedFeedLink(rss *rss.Feed) (link string) {
atomExtensions := t.extensionsForKeys([]string{"atom", "atom10", "atom03"}, rss.Extensions)
for _, ex := range atomExtensions {
if links, ok := ex["link"]; ok {
for _, l := range links {
if l.Attrs["Rel"] == "self" {
link = l.Value
}
}
}
}
return
}
func (t *DefaultRSSTranslator) translateFeedUpdated(rss *rss.Feed) (updated string) {
if rss.LastBuildDate != "" {
updated = rss.LastBuildDate
} else if rss.DublinCoreExt != nil && rss.DublinCoreExt.Date != nil {
updated = t.firstEntry(rss.DublinCoreExt.Date)
}
return
}
func (t *DefaultRSSTranslator) translateFeedUpdatedParsed(rss *rss.Feed) (updated *time.Time) {
if rss.LastBuildDateParsed != nil {
updated = rss.LastBuildDateParsed
} else if rss.DublinCoreExt != nil && rss.DublinCoreExt.Date != nil {
dateText := t.firstEntry(rss.DublinCoreExt.Date)
date, err := shared.ParseDate(dateText)
if err == nil {
updated = &date
}
}
return
}
func (t *DefaultRSSTranslator) translateFeedPublished(rss *rss.Feed) (published string) {
return rss.PubDate
}
func (t *DefaultRSSTranslator) translateFeedPublishedParsed(rss *rss.Feed) (published *time.Time) {
return rss.PubDateParsed
}
func (t *DefaultRSSTranslator) translateFeedAuthor(rss *rss.Feed) (author *Person) {
if rss.ManagingEditor != "" {
name, address := shared.ParseNameAddress(rss.ManagingEditor)
author = &Person{}
author.Name = name
author.Email = address
} else if rss.WebMaster != "" {
name, address := shared.ParseNameAddress(rss.WebMaster)
author = &Person{}
author.Name = name
author.Email = address
} else if rss.DublinCoreExt != nil && rss.DublinCoreExt.Author != nil {
dcAuthor := t.firstEntry(rss.DublinCoreExt.Author)
name, address := shared.ParseNameAddress(dcAuthor)
author = &Person{}
author.Name = name
author.Email = address
} else if rss.DublinCoreExt != nil && rss.DublinCoreExt.Creator != nil {
dcCreator := t.firstEntry(rss.DublinCoreExt.Creator)
name, address := shared.ParseNameAddress(dcCreator)
author = &Person{}
author.Name = name
author.Email = address
} else if rss.ITunesExt != nil && rss.ITunesExt.Author != "" {
name, address := shared.ParseNameAddress(rss.ITunesExt.Author)
author = &Person{}
author.Name = name
author.Email = address
}
return
}
func (t *DefaultRSSTranslator) translateFeedLanguage(rss *rss.Feed) (language string) {
if rss.Language != "" {
language = rss.Language
} else if rss.DublinCoreExt != nil && rss.DublinCoreExt.Language != nil {
language = t.firstEntry(rss.DublinCoreExt.Language)
}
return
}
func (t *DefaultRSSTranslator) translateFeedImage(rss *rss.Feed) (image *Image) {
if rss.Image != nil {
image = &Image{}
image.Title = rss.Image.Title
image.URL = rss.Image.URL
} else if rss.ITunesExt != nil && rss.ITunesExt.Image != "" {
image = &Image{}
image.URL = rss.ITunesExt.Image
}
return
}
func (t *DefaultRSSTranslator) translateFeedCopyright(rss *rss.Feed) (rights string) {
if rss.Copyright != "" {
rights = rss.Copyright
} else if rss.DublinCoreExt != nil && rss.DublinCoreExt.Rights != nil {
rights = t.firstEntry(rss.DublinCoreExt.Rights)
}
return
}
func (t *DefaultRSSTranslator) translateFeedGenerator(rss *rss.Feed) (generator string) {
return rss.Generator
}
func (t *DefaultRSSTranslator) translateFeedCategories(rss *rss.Feed) (categories []string) {
cats := []string{}
if rss.Categories != nil {
for _, c := range rss.Categories {
cats = append(cats, c.Value)
}
}
if rss.ITunesExt != nil && rss.ITunesExt.Keywords != "" {
keywords := strings.Split(rss.ITunesExt.Keywords, ",")
for _, k := range keywords {
cats = append(cats, k)
}
}
if rss.ITunesExt != nil && rss.ITunesExt.Categories != nil {
for _, c := range rss.ITunesExt.Categories {
cats = append(cats, c.Text)
if c.Subcategory != nil {
cats = append(cats, c.Subcategory.Text)
}
}
}
if rss.DublinCoreExt != nil && rss.DublinCoreExt.Subject != nil {
for _, c := range rss.DublinCoreExt.Subject {
cats = append(cats, c)
}
}
if len(cats) > 0 {
categories = cats
}
return
}
func (t *DefaultRSSTranslator) translateFeedItems(rss *rss.Feed) (items []*Item) {
items = []*Item{}
for _, i := range rss.Items {
items = append(items, t.translateFeedItem(i))
}
return
}
func (t *DefaultRSSTranslator) translateItemTitle(rssItem *rss.Item) (title string) {
if rssItem.Title != "" {
title = rssItem.Title
} else if rssItem.DublinCoreExt != nil && rssItem.DublinCoreExt.Title != nil {
title = t.firstEntry(rssItem.DublinCoreExt.Title)
}
return
}
func (t *DefaultRSSTranslator) translateItemDescription(rssItem *rss.Item) (desc string) {
if rssItem.Description != "" {
desc = rssItem.Description
} else if rssItem.DublinCoreExt != nil && rssItem.DublinCoreExt.Description != nil {
desc = t.firstEntry(rssItem.DublinCoreExt.Description)
}
return
}
func (t *DefaultRSSTranslator) translateItemContent(rssItem *rss.Item) (content string) {
return rssItem.Content
}
func (t *DefaultRSSTranslator) translateItemLink(rssItem *rss.Item) (link string) {
return rssItem.Link
}
func (t *DefaultRSSTranslator) translateItemUpdated(rssItem *rss.Item) (updated string) {
if rssItem.DublinCoreExt != nil && rssItem.DublinCoreExt.Date != nil {
updated = t.firstEntry(rssItem.DublinCoreExt.Date)
}
return updated
}
func (t *DefaultRSSTranslator) translateItemUpdatedParsed(rssItem *rss.Item) (updated *time.Time) {
if rssItem.DublinCoreExt != nil && rssItem.DublinCoreExt.Date != nil {
updatedText := t.firstEntry(rssItem.DublinCoreExt.Date)
updatedDate, err := shared.ParseDate(updatedText)
if err == nil {
updated = &updatedDate
}
}
return
}
func (t *DefaultRSSTranslator) translateItemPublished(rssItem *rss.Item) (pubDate string) {
if rssItem.PubDate != "" {
return rssItem.PubDate
} else if rssItem.DublinCoreExt != nil && rssItem.DublinCoreExt.Date != nil {
return t.firstEntry(rssItem.DublinCoreExt.Date)
}
return
}
func (t *DefaultRSSTranslator) translateItemPublishedParsed(rssItem *rss.Item) (pubDate *time.Time) {
if rssItem.PubDateParsed != nil {
return rssItem.PubDateParsed
} else if rssItem.DublinCoreExt != nil && rssItem.DublinCoreExt.Date != nil {
pubDateText := t.firstEntry(rssItem.DublinCoreExt.Date)
pubDateParsed, err := shared.ParseDate(pubDateText)
if err == nil {
pubDate = &pubDateParsed
}
}
return
}
func (t *DefaultRSSTranslator) translateItemAuthor(rssItem *rss.Item) (author *Person) {
if rssItem.Author != "" {
name, address := shared.ParseNameAddress(rssItem.Author)
author = &Person{}
author.Name = name
author.Email = address
} else if rssItem.DublinCoreExt != nil && rssItem.DublinCoreExt.Author != nil {
dcAuthor := t.firstEntry(rssItem.DublinCoreExt.Author)
name, address := shared.ParseNameAddress(dcAuthor)
author = &Person{}
author.Name = name
author.Email = address
} else if rssItem.DublinCoreExt != nil && rssItem.DublinCoreExt.Creator != nil {
dcCreator := t.firstEntry(rssItem.DublinCoreExt.Creator)
name, address := shared.ParseNameAddress(dcCreator)
author = &Person{}
author.Name = name
author.Email = address
} else if rssItem.ITunesExt != nil && rssItem.ITunesExt.Author != "" {
name, address := shared.ParseNameAddress(rssItem.ITunesExt.Author)
author = &Person{}
author.Name = name
author.Email = address
}
return
}
func (t *DefaultRSSTranslator) translateItemGUID(rssItem *rss.Item) (guid string) {
if rssItem.GUID != nil {
guid = rssItem.GUID.Value
}
return
}
func (t *DefaultRSSTranslator) translateItemImage(rssItem *rss.Item) (image *Image) {
if rssItem.ITunesExt != nil && rssItem.ITunesExt.Image != "" {
image = &Image{}
image.URL = rssItem.ITunesExt.Image
}
return
}
func (t *DefaultRSSTranslator) translateItemCategories(rssItem *rss.Item) (categories []string) {
cats := []string{}
if rssItem.Categories != nil {
for _, c := range rssItem.Categories {
cats = append(cats, c.Value)
}
}
if rssItem.ITunesExt != nil && rssItem.ITunesExt.Keywords != "" {
keywords := strings.Split(rssItem.ITunesExt.Keywords, ",")
for _, k := range keywords {
cats = append(cats, k)
}
}
if rssItem.DublinCoreExt != nil && rssItem.DublinCoreExt.Subject != nil {
for _, c := range rssItem.DublinCoreExt.Subject {
cats = append(cats, c)
}
}
if len(cats) > 0 {
categories = cats
}
return
}
func (t *DefaultRSSTranslator) translateItemEnclosures(rssItem *rss.Item) (enclosures []*Enclosure) {
if rssItem.Enclosure != nil {
e := &Enclosure{}
e.URL = rssItem.Enclosure.URL
e.Type = rssItem.Enclosure.Type
e.Length = rssItem.Enclosure.Length
enclosures = []*Enclosure{e}
}
return
}
func (t *DefaultRSSTranslator) extensionsForKeys(keys []string, extensions ext.Extensions) (matches []map[string][]ext.Extension) {
matches = []map[string][]ext.Extension{}
if extensions == nil {
return
}
for _, key := range keys {
if match, ok := extensions[key]; ok {
matches = append(matches, match)
}
}
return
}
func (t *DefaultRSSTranslator) firstEntry(entries []string) (value string) {
if entries == nil {
return
}
if len(entries) == 0 {
return
}
return entries[0]
}
// DefaultAtomTranslator converts an atom.Feed struct
// into the generic Feed struct.
//
// This default implementation defines a set of
// mapping rules between atom.Feed -> Feed
// for each of the fields in Feed.
type DefaultAtomTranslator struct{}
// Translate converts an Atom feed into the universal
// feed type.
func (t *DefaultAtomTranslator) Translate(feed interface{}) (*Feed, error) {
atom, found := feed.(*atom.Feed)
if !found {
return nil, fmt.Errorf("Feed did not match expected type of *atom.Feed")
}
result := &Feed{}
result.Title = t.translateFeedTitle(atom)
result.Description = t.translateFeedDescription(atom)
result.Link = t.translateFeedLink(atom)
result.FeedLink = t.translateFeedFeedLink(atom)
result.Updated = t.translateFeedUpdated(atom)
result.UpdatedParsed = t.translateFeedUpdatedParsed(atom)
result.Author = t.translateFeedAuthor(atom)
result.Language = t.translateFeedLanguage(atom)
result.Image = t.translateFeedImage(atom)
result.Copyright = t.translateFeedCopyright(atom)
result.Categories = t.translateFeedCategories(atom)
result.Generator = t.translateFeedGenerator(atom)
result.Items = t.translateFeedItems(atom)
result.Extensions = atom.Extensions
result.FeedVersion = atom.Version
result.FeedType = "atom"
return result, nil
}
func (t *DefaultAtomTranslator) translateFeedItem(entry *atom.Entry) (item *Item) {
item = &Item{}
item.Title = t.translateItemTitle(entry)
item.Description = t.translateItemDescription(entry)
item.Content = t.translateItemContent(entry)
item.Link = t.translateItemLink(entry)
item.Updated = t.translateItemUpdated(entry)
item.UpdatedParsed = t.translateItemUpdatedParsed(entry)
item.Published = t.translateItemPublished(entry)
item.PublishedParsed = t.translateItemPublishedParsed(entry)
item.Author = t.translateItemAuthor(entry)
item.GUID = t.translateItemGUID(entry)
item.Image = t.translateItemImage(entry)
item.Categories = t.translateItemCategories(entry)
item.Enclosures = t.translateItemEnclosures(entry)
item.Extensions = entry.Extensions
return
}
func (t *DefaultAtomTranslator) translateFeedTitle(atom *atom.Feed) (title string) {
return atom.Title
}
func (t *DefaultAtomTranslator) translateFeedDescription(atom *atom.Feed) (desc string) {
return atom.Subtitle
}
func (t *DefaultAtomTranslator) translateFeedLink(atom *atom.Feed) (link string) {
l := t.firstLinkWithType("alternate", atom.Links)
if l != nil {
link = l.Href
}
return
}
func (t *DefaultAtomTranslator) translateFeedFeedLink(atom *atom.Feed) (link string) {
feedLink := t.firstLinkWithType("self", atom.Links)
if feedLink != nil {
link = feedLink.Href
}
return
}
func (t *DefaultAtomTranslator) translateFeedUpdated(atom *atom.Feed) (updated string) {
return atom.Updated
}
func (t *DefaultAtomTranslator) translateFeedUpdatedParsed(atom *atom.Feed) (updated *time.Time) {
return atom.UpdatedParsed
}
func (t *DefaultAtomTranslator) translateFeedAuthor(atom *atom.Feed) (author *Person) {
a := t.firstPerson(atom.Authors)
if a != nil {
feedAuthor := Person{}
feedAuthor.Name = a.Name
feedAuthor.Email = a.Email
author = &feedAuthor
}
return
}
func (t *DefaultAtomTranslator) translateFeedLanguage(atom *atom.Feed) (language string) {
return atom.Language
}
func (t *DefaultAtomTranslator) translateFeedImage(atom *atom.Feed) (image *Image) {
if atom.Logo != "" {
feedImage := Image{}
feedImage.URL = atom.Logo
image = &feedImage
}
return
}
func (t *DefaultAtomTranslator) translateFeedCopyright(atom *atom.Feed) (rights string) {
return atom.Rights
}
func (t *DefaultAtomTranslator) translateFeedGenerator(atom *atom.Feed) (generator string) {
if atom.Generator != nil {
if atom.Generator.Value != "" {
generator += atom.Generator.Value
}
if atom.Generator.Version != "" {
generator += " v" + atom.Generator.Version
}
if atom.Generator.URI != "" {
generator += " " + atom.Generator.URI
}
generator = strings.TrimSpace(generator)
}
return
}
func (t *DefaultAtomTranslator) translateFeedCategories(atom *atom.Feed) (categories []string) {
if atom.Categories != nil {
categories = []string{}
for _, c := range atom.Categories {
categories = append(categories, c.Term)
}
}
return
}
func (t *DefaultAtomTranslator) translateFeedItems(atom *atom.Feed) (items []*Item) {
items = []*Item{}
for _, entry := range atom.Entries {
items = append(items, t.translateFeedItem(entry))
}
return
}
func (t *DefaultAtomTranslator) translateItemTitle(entry *atom.Entry) (title string) {
return entry.Title
}
func (t *DefaultAtomTranslator) translateItemDescription(entry *atom.Entry) (desc string) {
return entry.Summary
}
func (t *DefaultAtomTranslator) translateItemContent(entry *atom.Entry) (content string) {
if entry.Content != nil {
content = entry.Content.Value
}
return
}
func (t *DefaultAtomTranslator) translateItemLink(entry *atom.Entry) (link string) {
l := t.firstLinkWithType("alternate", entry.Links)
if l != nil {
link = l.Href
}
return
}
func (t *DefaultAtomTranslator) translateItemUpdated(entry *atom.Entry) (updated string) {
return entry.Updated
}
func (t *DefaultAtomTranslator) translateItemUpdatedParsed(entry *atom.Entry) (updated *time.Time) {
return entry.UpdatedParsed
}
func (t *DefaultAtomTranslator) translateItemPublished(entry *atom.Entry) (updated string) {
return entry.Published
}
func (t *DefaultAtomTranslator) translateItemPublishedParsed(entry *atom.Entry) (updated *time.Time) {
return entry.PublishedParsed
}
func (t *DefaultAtomTranslator) translateItemAuthor(entry *atom.Entry) (author *Person) {
a := t.firstPerson(entry.Authors)
if a != nil {
author = &Person{}
author.Name = a.Name
author.Email = a.Email
}
return
}
func (t *DefaultAtomTranslator) translateItemGUID(entry *atom.Entry) (guid string) {
return entry.ID
}
func (t *DefaultAtomTranslator) translateItemImage(entry *atom.Entry) (image *Image) {
return nil
}
func (t *DefaultAtomTranslator) translateItemCategories(entry *atom.Entry) (categories []string) {
if entry.Categories != nil {
categories = []string{}
for _, c := range entry.Categories {
categories = append(categories, c.Term)
}
}
return
}
func (t *DefaultAtomTranslator) translateItemEnclosures(entry *atom.Entry) (enclosures []*Enclosure) {
if entry.Links != nil {
enclosures = []*Enclosure{}
for _, e := range entry.Links {
if e.Rel == "enclosure" {
enclosure := &Enclosure{}
enclosure.URL = e.Href
enclosure.Length = e.Length
enclosure.Type = e.Type
enclosures = append(enclosures, enclosure)
}
}
if len(enclosures) == 0 {
enclosures = nil
}
}
return
}
func (t *DefaultAtomTranslator) firstLinkWithType(linkType string, links []*atom.Link) *atom.Link {
if links == nil {
return nil
}
for _, link := range links {
if link.Rel == linkType {
return link
}
}
return nil
}
func (t *DefaultAtomTranslator) firstPerson(persons []*atom.Person) (person *atom.Person) {
if persons == nil || len(persons) == 0 {
return
}
person = persons[0]
return
}

21
vendor/github.com/mmcdole/goxpp/LICENSE generated vendored Normal file
View File

@@ -0,0 +1,21 @@
The MIT License (MIT)
Copyright (c) 2016 mmcdole
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

9
vendor/github.com/mmcdole/goxpp/README.md generated vendored Normal file
View File

@@ -0,0 +1,9 @@
# goxpp
[![Build Status](https://travis-ci.org/mmcdole/goxpp.svg?branch=master)](https://travis-ci.org/mmcdole/goxpp) [![Coverage Status](https://coveralls.io/repos/github/mmcdole/goxpp/badge.svg?branch=master)](https://coveralls.io/github/mmcdole/goxpp?branch=master) [![License](http://img.shields.io/:license-mit-blue.svg)](http://doge.mit-license.org)
[![GoDoc](https://godoc.org/github.com/mmcdole/goxpp?status.svg)](https://godoc.org/github.com/mmcdole/goxpp)
The `goxpp` library is an XML parser library that is loosely based on the [Java XMLPullParser](http://www.xmlpull.org/v1/download/unpacked/doc/quick_intro.html). This library allows you to easily parse arbitrary XML content using a pull parser. You can think of `goxpp` as a lightweight wrapper around Go's XML `Decoder` that provides a set of functions that make it easier to parse XML content than using the raw decoder itself.
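A minimal usage sketch (the sample XML and variable names are illustrative):

```go
package main

import (
	"fmt"
	"log"
	"strings"

	xpp "github.com/mmcdole/goxpp"
)

func main() {
	raw := `<rss version="2.0"><channel><title>Sample</title></channel></rss>`
	p := xpp.NewXMLPullParser(strings.NewReader(raw), false, nil)
	for {
		tok, err := p.Next()
		if err != nil {
			log.Fatal(err)
		}
		if tok == xpp.EndDocument {
			break
		}
		// Pull events until the element we care about appears.
		if tok == xpp.StartTag && p.Name == "title" {
			text, err := p.NextText()
			if err != nil {
				log.Fatal(err)
			}
			fmt.Println("title:", text) // title: Sample
		}
	}
}
```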
This project is licensed under the [MIT License](https://raw.githubusercontent.com/mmcdole/goxpp/master/LICENSE)

342
vendor/github.com/mmcdole/goxpp/xpp.go generated vendored Normal file
View File

@@ -0,0 +1,342 @@
package xpp
import (
"encoding/xml"
"errors"
"fmt"
"io"
"strings"
)
type XMLEventType int
type CharsetReader func(charset string, input io.Reader) (io.Reader, error)
const (
StartDocument XMLEventType = iota
EndDocument
StartTag
EndTag
Text
Comment
ProcessingInstruction
Directive
IgnorableWhitespace // TODO: ?
// TODO: CDSECT ?
)
type XMLPullParser struct {
// Document State
Spaces map[string]string
SpacesStack []map[string]string
// Token State
Depth int
Event XMLEventType
Attrs []xml.Attr
Name string
Space string
Text string
decoder *xml.Decoder
token interface{}
}
func NewXMLPullParser(r io.Reader, strict bool, cr CharsetReader) *XMLPullParser {
d := xml.NewDecoder(r)
d.Strict = strict
d.CharsetReader = cr
return &XMLPullParser{
decoder: d,
Event: StartDocument,
Depth: 0,
Spaces: map[string]string{},
}
}
func (p *XMLPullParser) NextTag() (event XMLEventType, err error) {
t, err := p.Next()
if err != nil {
return event, err
}
for t == Text && p.IsWhitespace() {
t, err = p.Next()
if err != nil {
return event, err
}
}
if t != StartTag && t != EndTag {
return event, fmt.Errorf("Expected StartTag or EndTag but got %s at offset: %d", p.EventName(t), p.decoder.InputOffset())
}
return t, nil
}
func (p *XMLPullParser) Next() (event XMLEventType, err error) {
for {
event, err = p.NextToken()
if err != nil {
return event, err
}
// Return immediately after encountering a StartTag
// EndTag, Text, EndDocument
if event == StartTag ||
event == EndTag ||
event == EndDocument ||
event == Text {
return event, nil
}
// Skip Comment/Directive and ProcessingInstruction
if event == Comment ||
event == Directive ||
event == ProcessingInstruction {
continue
}
}
return event, nil
}
func (p *XMLPullParser) NextToken() (event XMLEventType, err error) {
// Clear any state held for the previous token
p.resetTokenState()
token, err := p.decoder.Token()
if err != nil {
if err == io.EOF {
// XML decoder returns the EOF as an error
// but we want to return it as a valid
// EndDocument token instead
p.token = nil
p.Event = EndDocument
return p.Event, nil
}
return event, err
}
p.token = xml.CopyToken(token)
p.processToken(p.token)
p.Event = p.EventType(p.token)
return p.Event, nil
}
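// NextText must be called while positioned on a StartTag. It concatenates
// the Text events that follow and returns the accumulated string once the
// closing EndTag is reached.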
func (p *XMLPullParser) NextText() (string, error) {
if p.Event != StartTag {
return "", errors.New("Parser must be on StartTag to get NextText()")
}
t, err := p.Next()
if err != nil {
return "", err
}
if t != EndTag && t != Text {
return "", errors.New("Parser must be on EndTag or Text to read text")
}
var result string
for t == Text {
result = result + p.Text
t, err = p.Next()
if err != nil {
return "", err
}
if t != EndTag && t != Text {
errstr := fmt.Sprintf("Event Text must be immediately followed by EndTag or Text but got %s", p.EventName(t))
return "", errors.New(errstr)
}
}
return result, nil
}
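// Skip consumes tokens until the EndTag that closes the current element,
// recursing through any nested elements along the way.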
func (p *XMLPullParser) Skip() error {
for {
tok, err := p.NextToken()
if err != nil {
return err
}
if tok == StartTag {
if err := p.Skip(); err != nil {
return err
}
} else if tok == EndTag {
return nil
}
}
}
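// Attribute returns the value of the attribute with the given local name
// on the current StartTag, or the empty string if no such attribute exists.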
func (p *XMLPullParser) Attribute(name string) string {
for _, attr := range p.Attrs {
if attr.Name.Local == name {
return attr.Value
}
}
return ""
}
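// Expect verifies that the current event matches the given event type and
// (case-insensitive) element name, ignoring the namespace.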
func (p *XMLPullParser) Expect(event XMLEventType, name string) (err error) {
return p.ExpectAll(event, "*", name)
}
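// ExpectAll verifies the current event type, namespace, and element name;
// "*" acts as a wildcard for the namespace or the name.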
func (p *XMLPullParser) ExpectAll(event XMLEventType, space string, name string) (err error) {
if !(p.Event == event && (strings.EqualFold(p.Space, space) || space == "*") && (strings.EqualFold(p.Name, name) || name == "*")) {
err = fmt.Errorf("Expected Space:%s Name:%s Event:%s but got Space:%s Name:%s Event:%s at offset: %d", space, name, p.EventName(event), p.Space, p.Name, p.EventName(p.Event), p.decoder.InputOffset())
}
return
}
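// DecodeElement unmarshals the current element (and all of its children)
// into v using the underlying decoder, then leaves the parser positioned
// on the element's EndTag.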
func (p *XMLPullParser) DecodeElement(v interface{}) error {
if p.Event != StartTag {
return errors.New("DecodeElement can only be called from a StartTag event")
}
startToken := p.token.(xml.StartElement)
// Consumes all tokens until the matching end token.
err := p.decoder.DecodeElement(v, &startToken)
if err != nil {
return err
}
name := p.Name
// Need to set the "current" token name/event
// to the previous StartTag event's name
p.resetTokenState()
p.Event = EndTag
p.Depth--
p.Name = name
p.token = nil
return nil
}
func (p *XMLPullParser) IsWhitespace() bool {
return strings.TrimSpace(p.Text) == ""
}
func (p *XMLPullParser) EventName(e XMLEventType) (name string) {
switch e {
case StartTag:
name = "StartTag"
case EndTag:
name = "EndTag"
case StartDocument:
name = "StartDocument"
case EndDocument:
name = "EndDocument"
case ProcessingInstruction:
name = "ProcessingInstruction"
case Directive:
name = "Directive"
case Comment:
name = "Comment"
case Text:
name = "Text"
case IgnorableWhitespace:
name = "IgnorableWhitespace"
}
return
}
func (p *XMLPullParser) EventType(t xml.Token) (event XMLEventType) {
switch t.(type) {
case xml.StartElement:
event = StartTag
case xml.EndElement:
event = EndTag
case xml.CharData:
event = Text
case xml.Comment:
event = Comment
case xml.ProcInst:
event = ProcessingInstruction
case xml.Directive:
event = Directive
}
return
}
func (p *XMLPullParser) processToken(t xml.Token) {
switch tt := t.(type) {
case xml.StartElement:
p.processStartToken(tt)
case xml.EndElement:
p.processEndToken(tt)
case xml.CharData:
p.processCharDataToken(tt)
case xml.Comment:
p.processCommentToken(tt)
case xml.ProcInst:
p.processProcInstToken(tt)
case xml.Directive:
p.processDirectiveToken(tt)
}
}
func (p *XMLPullParser) processStartToken(t xml.StartElement) {
p.Depth++
p.Attrs = t.Attr
p.Name = t.Name.Local
p.Space = t.Name.Space
p.trackNamespaces(t)
}
func (p *XMLPullParser) processEndToken(t xml.EndElement) {
p.Depth--
p.SpacesStack = p.SpacesStack[:len(p.SpacesStack)-1]
if len(p.SpacesStack) == 0 {
p.Spaces = map[string]string{}
} else {
p.Spaces = p.SpacesStack[len(p.SpacesStack)-1]
}
p.Name = t.Name.Local
}
func (p *XMLPullParser) processCharDataToken(t xml.CharData) {
p.Text = string([]byte(t))
}
func (p *XMLPullParser) processCommentToken(t xml.Comment) {
p.Text = string([]byte(t))
}
func (p *XMLPullParser) processProcInstToken(t xml.ProcInst) {
p.Text = fmt.Sprintf("%s %s", t.Target, string(t.Inst))
}
func (p *XMLPullParser) processDirectiveToken(t xml.Directive) {
p.Text = string([]byte(t))
}
func (p *XMLPullParser) resetTokenState() {
p.Attrs = nil
p.Name = ""
p.Space = ""
p.Text = ""
}
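// trackNamespaces records the xmlns declarations on a start element, mapping
// each namespace URL to its prefix (the default namespace maps to ""), and
// pushes the resulting table onto the namespace stack so it can be popped on
// the matching end element.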
func (p *XMLPullParser) trackNamespaces(t xml.StartElement) {
newSpace := map[string]string{}
for k, v := range p.Spaces {
newSpace[k] = v
}
for _, attr := range t.Attr {
if attr.Name.Space == "xmlns" {
space := strings.TrimSpace(attr.Value)
spacePrefix := strings.TrimSpace(strings.ToLower(attr.Name.Local))
newSpace[space] = spacePrefix
} else if attr.Name.Local == "xmlns" {
space := strings.TrimSpace(attr.Value)
newSpace[space] = ""
}
}
p.Spaces = newSpace
p.SpacesStack = append(p.SpacesStack, newSpace)
}