Documentation
Overview ¶
Package sherlock is a library for extracting metadata from web pages. It uses as many methods as possible to extract page data, including:
- ActivityStreams/JSON-LD
- Open Graph
- Microformats2
Coming soon:
- HTML Meta Tags
- oEmbed
- JSON-LD
- Twitter Cards?
Index ¶
- Constants
- func AuthorizedFetch(publicKeyID string, privateKey crypto.PrivateKey) remote.Option
- func IsValidAddress(address string) bool
- func ParseOEmbed(reader io.Reader, data mapof.Any)
- type Client
- func (client Client) Delete(documentID string) error
- func (client Client) Load(url string, options ...any) (streams.Document, error)
- func (client Client) Save(document streams.Document) error
- func (client Client) SetRootClient(rootClient streams.Client)
- func (client *Client) With(options ...ClientOption)
- type ClientOption
- type Config
- type KeyPairFunc
- type Option
- func AsActor() Option
- func AsCollection() Option
- func AsDocument() Option
- func WithDefaultValue(defaultValue map[string]any) Option
- func WithKeyPair(publicKeyID string, privateKey crypto.PrivateKey) Option
- func WithMaximumRedirects(maximumRedirects int) Option
- func WithRemoteOptions(options ...remote.Option) Option
Constants ¶
const ContentType = "Content-Type"
ContentType is the string used in the HTTP header to designate a MIME type
const ContentTypeActivityPub = "application/activity+json"
ContentTypeActivityPub is the standard MIME type for ActivityPub content
const ContentTypeAtom = "application/atom+xml"
ContentTypeAtom is the standard MIME Type for Atom Feeds
const ContentTypeForm = "application/x-www-form-urlencoded"
ContentTypeForm is the standard MIME Type for Form encoded content
const ContentTypeHTML = "text/html"
ContentTypeHTML is the standard MIME type for HTML content
const ContentTypeJSON = "application/json"
ContentTypeJSON is the standard MIME Type for JSON content
const ContentTypeJSONFeed = "application/feed+json"
ContentTypeJSONFeed is the standard MIME Type for JSON Feed content: https://en.wikipedia.org/wiki/JSON_Feed
const ContentTypeJSONLD = "application/ld+json"
ContentTypeJSONLD is the standard MIME Type for JSON-LD content: https://en.wikipedia.org/wiki/JSON-LD
const ContentTypeJSONResourceDescriptor = "application/jrd+json"
ContentTypeJSONResourceDescriptor is the standard MIME Type for JSON Resource Descriptor content which is used by WebFinger: https://datatracker.ietf.org/doc/html/rfc7033#section-10.2
const ContentTypePlain = "text/plain"
ContentTypePlain is the default plaintext MIME type
const ContentTypeRSS = "application/rss+xml"
ContentTypeRSS is the standard MIME Type for RSS Feeds
const ContentTypeXML = "application/xml"
ContentTypeXML is the standard MIME Type for XML content
const FormatActivityStream = "ACTIVITYSTREAM"
const FormatJSONFeed = "JSONFEED"
const FormatMicroFormats = "MICROFORMATS"
const FormatRSS = "RSS"
const HTTPHeaderAccept = "Accept"
HTTPHeaderAccept is the string used in the HTTP header to request a response be encoded as a MIME type
const HTTPHeaderCacheControl = "Cache-Control"
const HTTPHeaderLink = "Link"
const IdentifierTypeNone = "NONE"
const IdentifierTypeURL = "URL"
const IdentifierTypeUsername = "USERNAME"
const LinkRelationAlternate = "alternate"
const LinkRelationFeed = "feed"
const LinkRelationHub = "hub"
const LinkRelationIcon = "icon"
const LinkRelationSelf = "self"
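As a rough illustration of how the header and MIME type constants fit together, the sketch below builds a request whose Accept header lists several of the content types above. The constants are redeclared locally so the example compiles on its own, `newRequest` is a hypothetical helper, and the order of preference is an assumption rather than what sherlock actually sends.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// Local copies of values from the constants list, so that this sketch
// compiles without importing the package.
const (
	HTTPHeaderAccept       = "Accept"
	ContentTypeActivityPub = "application/activity+json"
	ContentTypeJSONLD      = "application/ld+json"
	ContentTypeHTML        = "text/html"
)

// newRequest is a hypothetical helper (not part of sherlock). It builds a
// GET request that asks for ActivityPub first, then JSON-LD, then HTML.
// The ordering is an assumption made for this example.
func newRequest(target string) (*http.Request, error) {
	request, err := http.NewRequest(http.MethodGet, target, nil)
	if err != nil {
		return nil, err
	}

	accept := strings.Join([]string{
		ContentTypeActivityPub,
		ContentTypeJSONLD,
		ContentTypeHTML,
	}, ", ")
	request.Header.Set(HTTPHeaderAccept, accept)
	return request, nil
}

func main() {
	request, err := newRequest("https://host.tld/username")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(request.Header.Get(HTTPHeaderAccept))
}
```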
Variables ¶
This section is empty.
Functions ¶
func AuthorizedFetch ¶ added in v0.8.0
func AuthorizedFetch(publicKeyID string, privateKey crypto.PrivateKey) remote.Option
AuthorizedFetch is a remote.Option that signs all outbound requests according to the ActivityPub "Authorized Fetch" convention: https://funfedi.dev/testing_tools/http_signatures/
func IsValidAddress ¶ added in v0.6.5
func IsValidAddress(address string) bool
IsValidAddress returns TRUE for all values that Sherlock THINKS it SHOULD be able to process. This includes @username@host.tld and https://host.tld/username addresses. IMPORTANT: Just because this function returns TRUE does NOT mean that the address is valid. It only means that the value looks like a valid format; the address itself still needs to be checked.
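The difference between "looks like a valid format" and "is valid" can be sketched with a purely syntactic check. The function below is a hypothetical stand-in written for this example, not sherlock's implementation, and its exact rules (one "@" separator, a dot in the host) are assumptions. It never touches the network, so a TRUE result says nothing about whether the account or page exists.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// looksLikeAddress is a hypothetical stand-in, NOT sherlock's implementation.
// It reports whether value has the shape of a "@username@host.tld" handle or
// an "http(s)://host.tld/username" URL. It performs no network requests.
func looksLikeAddress(value string) bool {
	// URL form: must parse and must name a host.
	if strings.HasPrefix(value, "http://") || strings.HasPrefix(value, "https://") {
		parsed, err := url.Parse(value)
		return err == nil && parsed.Host != ""
	}

	// Handle form: optional leading "@", then exactly one "@" separating a
	// non-empty username from a host that contains an interior dot.
	handle := strings.TrimPrefix(value, "@")
	username, host, found := strings.Cut(handle, "@")
	if !found || username == "" || strings.Contains(host, "@") {
		return false
	}
	return strings.Contains(host, ".") &&
		!strings.HasPrefix(host, ".") &&
		!strings.HasSuffix(host, ".")
}

func main() {
	values := []string{"@username@host.tld", "https://host.tld/username", "not an address"}
	for _, value := range values {
		fmt.Println(value, looksLikeAddress(value))
	}
}
```

A caller would still have to load the address (for example through WebFinger or an HTTP request) before treating it as real.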
Types ¶
type Client ¶
type Client struct {
// contains filtered or unexported fields
}
Client implements the hannibal/streams.Client interface, and is used to load JSON-LD documents from remote servers. The sherlock client maps additional metadata into a standard ActivityStreams document.
func NewClient ¶
func NewClient(options ...ClientOption) Client
NewClient returns a fully initialized Client object
func (Client) Load ¶
func (client Client) Load(url string, options ...any) (streams.Document, error)
Load retrieves a document from a remote server and returns it as a streams.Document. It uses either the "Actor" or the "Document" method to generate its ActivityStreams result. "Document" treats the URL as a single ActivityStreams document, translating OpenGraph, MicroFormats, and JSON-LD into an ActivityStreams equivalent. "Actor" treats the URL as an Actor, translating RSS, Atom, JSON, and MicroFormats feeds into an ActivityStreams equivalent.
func (Client) SetRootClient ¶ added in v0.8.12
func (client Client) SetRootClient(rootClient streams.Client)
func (*Client) With ¶ added in v0.8.12
func (client *Client) With(options ...ClientOption)
type ClientOption ¶ added in v0.6.0
type ClientOption func(*Client)
func WithKeyPairFunc ¶ added in v0.8.12
func WithKeyPairFunc(fn KeyPairFunc) ClientOption
WithKeyPairFunc is a ClientOption that sets the ActorGetter for a Client. This allows the Client to retrieve the public key ID and private key for a given URL only when needed, rather than performing expensive database queries ahead of time.
func WithUserAgent ¶ added in v0.6.0
func WithUserAgent(userAgent string) ClientOption
WithUserAgent is a ClientOption that sets the UserAgent property on the Client object
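ClientOption follows the common Go "functional options" pattern: each option is a function that mutates the Client it receives. Because the real Client's fields are unexported, the sketch below uses lowercase stand-in types (`client`, `clientOption`, `withUserAgent`) that mirror the signatures documented here. The field name and the default value are assumptions for illustration only.

```go
package main

import "fmt"

// client is a hypothetical stand-in for the package's Client type.
type client struct {
	userAgent string
}

// clientOption mirrors the documented shape: type ClientOption func(*Client)
type clientOption func(*client)

// withUserAgent mirrors WithUserAgent: it returns an option that sets the
// user agent on whatever client it is applied to.
func withUserAgent(userAgent string) clientOption {
	return func(c *client) {
		c.userAgent = userAgent
	}
}

// with mirrors (*Client).With: it applies each option in order.
func (c *client) with(options ...clientOption) {
	for _, option := range options {
		option(c)
	}
}

// newClient mirrors NewClient: it starts from defaults, then applies options.
func newClient(options ...clientOption) client {
	result := client{userAgent: "default-agent"} // assumed default value
	result.with(options...)
	return result
}

func main() {
	c := newClient(withUserAgent("my-app/1.0"))
	fmt.Println(c.userAgent)
}
```

Since options are applied in order, a later option overrides an earlier one that sets the same field.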
type KeyPairFunc ¶ added in v0.8.12
type KeyPairFunc func() (publicKeyID string, privateKey crypto.PrivateKey)
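The point of a KeyPairFunc is deferral: the key lookup runs only when a request actually needs to be signed. The sketch below shows that idea with stand-in names. `loadKeyPair`, `fetch`, the `lookups` counter, and the key ID are all hypothetical, and a freshly generated Ed25519 key stands in for a key that would normally come from a database.

```go
package main

import (
	"crypto"
	"crypto/ed25519"
	"crypto/rand"
	"fmt"
)

// keyPairFunc mirrors the documented KeyPairFunc signature.
type keyPairFunc func() (publicKeyID string, privateKey crypto.PrivateKey)

// lookups counts how often the (simulated) expensive lookup runs.
var lookups int

// loadKeyPair is a hypothetical lookup. A real application would read the
// key from storage; here a new key is generated to keep the example small.
func loadKeyPair() (string, crypto.PrivateKey) {
	lookups++
	_, privateKey, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return "", nil
	}
	return "https://host.tld/username#main-key", privateKey
}

// fetch is a hypothetical request function. It calls fn only when the
// request must be signed, so unsigned requests never pay for the lookup.
func fetch(url string, signed bool, fn keyPairFunc) string {
	if !signed {
		return "unsigned request to " + url
	}
	publicKeyID, _ := fn() // the private key would be used to sign here
	return "request to " + url + " signed as " + publicKeyID
}

func main() {
	fmt.Println(fetch("https://host.tld/public", false, loadKeyPair))
	fmt.Println(fetch("https://host.tld/private", true, loadKeyPair))
	fmt.Println("lookups:", lookups)
}
```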
type Option ¶ added in v0.8.12
type Option func(*Config)
func AsActor ¶ added in v0.6.0
func AsActor() Option
AsActor tells Sherlock to try parsing the URL as an Actor object.
func AsCollection ¶ added in v0.6.0
func AsCollection() Option
AsCollection tells Sherlock to try parsing the URL as a Collection object
func AsDocument ¶ added in v0.6.0
func AsDocument() Option
AsDocument tells Sherlock to try parsing the URL as a Document object
func WithDefaultValue ¶ added in v0.6.0
func WithDefaultValue(defaultValue map[string]any) Option
WithDefaultValue is an Option that sets the DefaultValue, which is used as the base value for all documents loaded by the Client.
func WithKeyPair ¶ added in v0.8.12
func WithKeyPair(publicKeyID string, privateKey crypto.PrivateKey) Option
WithKeyPair is an Option that sets up the AuthorizedFetch remote middleware, which will sign all outbound requests according to the ActivityPub "Authorized Fetch" convention: https://funfedi.dev/testing_tools/http_signatures/
func WithMaximumRedirects ¶ added in v0.6.0
func WithMaximumRedirects(maximumRedirects int) Option
WithMaximumRedirects is an Option that sets the maximum number of redirects that the Client will follow when loading a document.
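To show what a redirect limit means in practice, the sketch below caps redirects using only the standard library's `http.Client.CheckRedirect` hook. This is an illustration of the general technique, not a description of how sherlock or the remote library enforces the limit. `clientWithMaximumRedirects` and `followRedirects` are hypothetical helpers, and the test server redirects forever on purpose.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

var errTooManyRedirects = errors.New("too many redirects")

// clientWithMaximumRedirects returns an http.Client that follows at most
// maximumRedirects redirects before giving up.
func clientWithMaximumRedirects(maximumRedirects int) *http.Client {
	return &http.Client{
		CheckRedirect: func(request *http.Request, via []*http.Request) error {
			// via holds the requests already made, so len(via) is the
			// number of the redirect that is about to be followed.
			if len(via) > maximumRedirects {
				return errTooManyRedirects
			}
			return nil
		},
	}
}

// followRedirects starts a local server that always redirects, then reports
// how many requests reached it and whether the limit stopped the client.
func followRedirects(maximumRedirects int) (requests int, stopped bool) {
	var hits atomic.Int32
	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		hits.Add(1)
		http.Redirect(w, r, "/next", http.StatusFound)
	}))
	defer server.Close()

	response, err := clientWithMaximumRedirects(maximumRedirects).Get(server.URL)
	if response != nil {
		response.Body.Close()
	}
	return int(hits.Load()), errors.Is(err, errTooManyRedirects)
}

func main() {
	requests, stopped := followRedirects(3)
	fmt.Println("requests:", requests, "stopped by limit:", stopped)
}
```

With a limit of N, the server sees the initial request plus N redirected requests before the client stops.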
func WithRemoteOptions ¶ added in v0.6.0
func WithRemoteOptions(options ...remote.Option) Option
WithRemoteOptions is an Option that adds remote.Options which are passed to the remote library when making requests.
Source Files
- actor-.go
- actor-WebFinger.go
- actor-activityStreams.go
- actor-feed-.go
- actor-feed-JSON.go
- actor-feed-RSS.go
- actor-feed-icon.go
- actor-feed-links.go
- actor-feed-microFormats.go
- authorized-fetch.go
- client-.go
- client-applyLinks.go
- constants.go
- document-.go
- document-activityStream.go
- document-html-.go
- document-html-jsonld-.go
- document-html-jsonld-embedded.go
- document-html-jsonld-linked.go
- document-html-microformats.go
- document-html-oembed.go
- document-html-opengraph.go
- document-html-wordpress.go
- interfaces.go
- options-client.go
- options-load.go
- sherlock-extras.go
- sherlock.go
- utils.go