Python - urllib Examples

Urllib. A Python program cannot read a file on the Internet as if it were a local file. It must instead make a network request. With urllib, a package in the standard library, we make these requests and download external files.

Urlopen notes. By calling urlopen in a for-loop, we can access each line of a web page. This can be useful when scraping for data on external sites.


Urlopen. First we import the urlopen function from the urllib.request module. This is the first line of the Python file. The program next calls urlopen().

Note The argument is the URL of the web page. Each line arrives as a bytes object, so we call decode() to convert it to a str (UTF-8 is the default encoding).

Tip The last argument to print is end="". Each decoded line already ends with a newline, so this stops print from adding a second one and producing double line breaks.

from urllib.request import urlopen

# Print first four lines of this site.
i = 0
for line in urlopen("http://www.example.com/"):
    # Decode.
    line = line.decode()

    # Print.
    print(i, line, end="")

    # See if past limit.
    if i == 3:
        break
    i += 1

0 <!doctype html>
1 <html>
2 <head>
3     <title>Example Domain</title>
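The loop above can also be written without a manual counter. The following is a minimal sketch, not part of the original example: the helper name first_lines and its limit and timeout parameters are assumptions made for illustration. It uses a with-statement so the connection is closed, and itertools.islice to stop after a fixed number of lines.

```python
from itertools import islice
from urllib.request import urlopen


def first_lines(url, limit=4, timeout=10):
    """Return up to `limit` decoded lines from the resource at `url`.

    Hypothetical helper for illustration; not part of urllib.
    """
    # The with-statement closes the response even if reading fails.
    with urlopen(url, timeout=timeout) as response:
        # Use the charset from the Content-Type header, else fall back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        # islice stops after `limit` lines, so no counter variable is needed.
        return [raw.decode(charset) for raw in islice(response, limit)]
```

Calling first_lines("http://www.example.com/") should return the same four lines as the loop above, provided the site is reachable. Each string keeps its trailing newline, so print(line, end="") still applies.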

Parse. Web locations usually begin with http or https. These are called URLs, which are a kind of URI. In Python we call the urlparse function from the urllib.parse module, and it returns a ParseResult.

Here We parse a URL. Then we access some fields on the ParseResult that urlparse returns.

Next The scheme is the "http" part, without the "://" that follows it. The netloc is the network location, which here is just the domain, with no leading or trailing punctuation.

Finally The path is the location on the domain. We parse the root page here, so the path is a single forward slash, "/".

from urllib.parse import urlparse

# Parse this url.
result = urlparse("http://www.example.com/")

# Get some values from the ParseResult.
scheme = result.scheme
loc = result.netloc
path = result.path

# Print our values.
print(scheme)
print(loc)
print(path)

http
www.example.com
/
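A ParseResult has more fields than the three shown above. As a small sketch, the URL below is a made-up one, chosen only because it contains a port, a query string and a fragment. It also uses parse_qs, from the same urllib.parse module, to turn the query string into a dictionary.

```python
from urllib.parse import parse_qs, urlparse

# Hypothetical URL, used only to show the remaining fields.
result = urlparse("https://www.example.com:8080/docs/page.html?lang=en&page=2#intro")

print(result.scheme)    # https
print(result.netloc)    # www.example.com:8080
print(result.hostname)  # www.example.com
print(result.port)      # 8080 (an int)
print(result.path)      # /docs/page.html
print(result.query)     # lang=en&page=2
print(result.fragment)  # intro

# parse_qs maps each query key to a list of its values.
query = parse_qs(result.query)
print(query["lang"])    # ['en']
```

Note that netloc includes the port when one is present, while hostname and port give the two parts separately.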

Summary. Python can fetch external files or web pages. But programs become more complex when they depend on external files, because a network request can fail and cause errors.
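One way to deal with such errors is to catch the exceptions from urllib.error. This is a minimal sketch under stated assumptions: the function name fetch_text, its timeout parameter, and the choice to print a message and return None are illustrative, not something the examples above prescribe.

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def fetch_text(url, timeout=10):
    """Return the decoded body of `url`, or None if the request fails.

    Hypothetical helper for illustration; not part of urllib.
    """
    try:
        with urlopen(url, timeout=timeout) as response:
            charset = response.headers.get_content_charset() or "utf-8"
            return response.read().decode(charset)
    except HTTPError as error:
        # The server answered, but with an error status such as 404.
        print("HTTP error:", error.code)
    except URLError as error:
        # No usable response, such as a DNS failure or a refused connection.
        print("Request failed:", error.reason)
    return None
```

HTTPError is a subclass of URLError, so it must be caught first. A caller can then test the result with "if text is None" before using it.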

Sam Allen is passionate about computer languages, and he maintains 100% of the material available on this website. He hopes it makes the world a nicer place.

This page was last updated on Mar 21, 2024.
