Identify ourselves when fetching files from an outside site

A growing number of websites block the generic python-requests user agent but do not block well-meaning clients that identify themselves with a unique user agent. Just as we identify ourselves in cross-client calls, we should identify ourselves when a user provides a URL to fetch and import into DocumentCloud. We should also add logging so that we can audit these fetches and detect when sites are blocking users from importing their documents into DocumentCloud.
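As a rough illustration of the idea, here is a minimal sketch of sending an identifying user agent and logging responses that look like a block. It is not the existing DocumentCloud code: the `USER_AGENT` string, the logger name, the `BLOCK_STATUSES` set, and the function names are all assumptions made for the example.

```python
import logging
from urllib.parse import urlparse

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("documentcloud.import_fetch")

# Hypothetical identifying string; the real product token, version, and
# contact URL would need to be decided by the project.
USER_AGENT = "DocumentCloud-Importer/1.0 (+https://www.documentcloud.org)"

# Status codes treated here as a likely block. This set is an assumption,
# not a standard; sites can also block in other ways (timeouts, captchas).
BLOCK_STATUSES = {401, 403, 429}


def build_headers():
    """Return request headers that identify us instead of python-requests."""
    return {"User-Agent": USER_AGENT}


def looks_blocked(status_code):
    """Heuristic: does this status code suggest the site refused us?"""
    return status_code in BLOCK_STATUSES


def fetch_for_import(url, session=None, timeout=30):
    """Fetch a user-supplied URL with our user agent and log the outcome."""
    if session is None:
        # requests is third-party; imported lazily so the helpers above
        # can be used and tested without it.
        import requests

        session = requests.Session()

    response = session.get(url, headers=build_headers(), timeout=timeout)
    host = urlparse(url).hostname

    if looks_blocked(response.status_code):
        # Warnings like this are what an audit would search for.
        logger.warning(
            "import fetch possibly blocked: host=%s status=%s",
            host,
            response.status_code,
        )
    else:
        logger.info(
            "import fetch ok: host=%s status=%s", host, response.status_code
        )
    return response
```

Logging the host rather than the full URL keeps the audit trail focused on which sites are blocking us; whether the full URL should also be recorded is a design choice left open here.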

This serves two purposes: it makes imports more reliable, and it gives us a record we could use to file a public records request asking why we are blocked, given that we host primary source materials for free to the public.

Identify ourselves when fetching files from an outside site #414
