URL normalization for Python (python3)
url-normalize
=============
URI Normalization function:
* Take care of IDN domains.
* Always provide the URI scheme in lowercase characters.
* Always provide the host, if any, in lowercase characters.
* Only perform percent-encoding where it is essential.
* Always use uppercase A-through-F characters when percent-encoding.
* Prevent dot-segments appearing in non-relative URI paths.
* For schemes that define a default authority, use an empty authority if the default is desired.
* For schemes that define an empty path to be equivalent to a path of "/", use "/".
* For schemes that define a port, use an empty port if the default is desired.
* All portions of the URI must be UTF-8 encoded NFC from Unicode strings.
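Several of the rules above can be approximated with the standard library alone. The following is a minimal sketch, not the code url-normalize itself uses: `normalize_sketch`, `remove_dot_segments`, `upper_escapes`, and the `DEFAULT_PORTS` table are hypothetical names chosen for illustration, and IPv6 literals and percent-encoding of unsafe characters are deliberately left out.

```python
import re
import unicodedata
from urllib.parse import urlsplit, urlunsplit

# Assumed default ports, for illustration only; a real table would cover more schemes.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def remove_dot_segments(path):
    """Resolve "." and ".." segments in an absolute path (in the spirit of RFC 3986, 5.2.4)."""
    output = []
    segments = path.split("/")
    for i, segment in enumerate(segments):
        last = i == len(segments) - 1
        if segment == "..":
            if len(output) > 1:  # never pop the leading empty segment (the root)
                output.pop()
            if last:
                output.append("")  # keep the trailing slash, e.g. "/a/b/.." -> "/a/"
        elif segment == ".":
            if last:
                output.append("")
        else:
            output.append(segment)
    return "/".join(output)


def upper_escapes(text):
    """Uppercase the hex digits of existing percent-escapes (%7e -> %7E)."""
    return re.sub(r"%[0-9a-fA-F]{2}", lambda m: m.group(0).upper(), text)


def normalize_sketch(url):
    """Apply a subset of the normalization rules to an absolute URL."""
    url = unicodedata.normalize("NFC", url)  # NFC-normalize the Unicode input
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = parts.hostname or ""  # urlsplit already lowercases the host
    if host:
        host = host.encode("idna").decode("ascii")  # IDN -> ASCII (punycode) form
    netloc = host
    # Keep the port only when it differs from the scheme's default.
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        netloc += ":" + str(parts.port)
    userinfo = parts.netloc.rpartition("@")[0]
    if userinfo:
        netloc = userinfo + "@" + netloc
    path = parts.path
    if path.startswith("/"):  # dot-segments are only removed from non-relative paths
        path = remove_dot_segments(path)
    if netloc and not path:
        path = "/"  # an empty path is treated as "/"
    return urlunsplit(
        (scheme, netloc, upper_escapes(path), upper_escapes(parts.query), parts.fragment)
    )
```

Calling `normalize_sketch("HTTP://WWW.Foo.com:80/a/../b/%7efoo")` exercises the scheme, host, port, dot-segment, and percent-encoding rules in one pass.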
Inspired by Sam Ruby's urlnorm.py: http://intertwingly.net/blog/2004/08/04/Urlnorm
Example:
```
$ pip install git+git://github.com/niksite/url-normalize.git
Collecting git+git://github.com/niksite/url-normalize.git
Cloning git://github.com/niksite/url-normalize.git to /tmp/pip-trXUik-build
Installing collected packages: url-normalize
Running setup.py install for url-normalize
Successfully installed url-normalize-1.2
$ python
Python 2.7.11 (default, Dec 8 2015, 23:51:37)
[GCC 4.9.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from url_normalize import url_normalize
>>> print(url_normalize('www.foo.com:80/foo'))
http://www.foo.com/foo
```
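The session above shows two behaviors at once: a scheme-less input gains `http`, and the default port `:80` disappears. As a rough illustration of how that could be done with the standard library (this is an assumption about the approach, not the package's actual code; `add_default_scheme`, `strip_default_port`, and `DEFAULT_PORTS` are hypothetical names):

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed default ports, for illustration only.
DEFAULT_PORTS = {"http": 80, "https": 443}


def add_default_scheme(url, default_scheme="http"):
    """Prefix a scheme-less URL such as 'www.foo.com:80/foo' with default_scheme."""
    if "://" in url:
        return url
    return default_scheme + "://" + url.lstrip("/")


def strip_default_port(url):
    """Drop the port when it equals the default for the URL's scheme."""
    parts = urlsplit(url)
    if parts.port is not None and parts.port == DEFAULT_PORTS.get(parts.scheme):
        # Remove the trailing ":<port>" from the network location.
        parts = parts._replace(netloc=parts.netloc.rsplit(":", 1)[0])
    return urlunsplit(parts)


print(strip_default_port(add_default_scheme("www.foo.com:80/foo")))
```

The scheme is added before parsing because `urlsplit` does not reliably recognize `www.foo.com:80` as a host and port when no scheme is present.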
History:
* 05 Jan 2016: Python 3 compatibility
* 29 Dec 2015: PEP8, setup.py
* 10 Mar 2010: support for shebang (#!) URLs
* 28 Feb 2010: using 'http' scheme by default when appropriate
* 28 Feb 2010: added handling of IDN domains
* 28 Feb 2010: code pep8-zation
* 27 Feb 2010: forked from Sam Ruby's urlnorm.py
Download files
Source Distribution
url-normalize-1.3.1.tar.gz (4.8 kB)
Built Distribution
url_normalize-1.3.1-py3-none-any.whl (7.3 kB)
File details
Details for the file url-normalize-1.3.1.tar.gz.
File metadata
- Download URL: url-normalize-1.3.1.tar.gz
- Upload date:
- Size: 4.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9740a8297482d0ab1032f1644e157a0f9ad289e0b07ab36bd3e6d83aaff65230 |
| MD5 | e22fc4690c2829ca8518f4dad9278f7c |
| BLAKE2b-256 | a0ac8fe1031e9fe9a4fa27ff355d9e224c731197cdb8f7eee4930a73241bc35e |
File details
Details for the file url_normalize-1.3.1-py3-none-any.whl.
File metadata
- Download URL: url_normalize-1.3.1-py3-none-any.whl
- Upload date:
- Size: 7.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ef6c035914e5ca8d459c6403d09191b4a1b84687ab2cade571f9f30035d60467 |
| MD5 | f9beb82bc31e403074a77453e557b9d5 |
| BLAKE2b-256 | 496e62d24f5837528c929232424ccd015bab6820b3b8b173ec0f7bd80be70e0d |