Python

Python - Reference

errors in Requests

| error type | safe to retry | example of when the error occurs |
| --- | --- | --- |
| HTTPError - 4xx | No | client error, e.g. calling the API incorrectly |
| HTTPError - 5xx | No | server error, e.g. bugs in the server code |
| ConnectionError | Maybe | incorrect configuration, or the network is unavailable |
| ConnectTimeout | Yes | network congestion, the server is too busy to respond, or the server is offline |
| ReadTimeout | Maybe | the server did not send any data in the allotted amount of time |
| requests library exception | No | data passed in cannot be converted to JSON |
| other Python exception | No | any Python error |

https://requests.readthedocs.io/en/latest/api/#requests.ReadTimeout

See also: requests doc

Catching requests errors

import requests
from requests import HTTPError, Timeout

# the Timeout (and ConnectionError) is raised by the request call itself,
# so it must be inside the try block; raise_for_status() raises HTTPError
try:
    response = requests.post(url, headers=headers, data=data)
    response.raise_for_status()
except (HTTPError, Timeout) as error:
    ...  # TODO: handle the error, e.g. log and retry or re-raise

In the event of a network problem (e.g. DNS failure, refused connection, etc), Requests will raise a ConnectionError exception. Response.raise_for_status() will raise an HTTPError if the HTTP request returned an unsuccessful status code. If a request times out, a Timeout exception is raised. If a request exceeds the configured number of maximum redirections, a TooManyRedirects exception is raised. All exceptions that Requests explicitly raises inherit from requests.exceptions.RequestException.
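raise_for_status() can be tried without a live server by constructing a Response object by hand (a contrivance purely for illustration):

```python
import requests

# hand-built Response, just to show raise_for_status() behaviour
response = requests.models.Response()
response.status_code = 503  # a 5xx status should raise HTTPError

try:
    response.raise_for_status()
    raised = False
except requests.HTTPError:
    raised = True
```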

I would argue that we should not catch ConnectionError in user-facing code: once the code is deployed to prod, connection errors caused by incorrect network configuration should not occur, and silently catching them would only hide the misconfiguration.

Error Classes Inheritance

RequestException(IOError)
---> HTTPError(RequestException)
---> ConnectionError(RequestException)
---> Timeout(RequestException)
-------> ConnectTimeout(ConnectionError, Timeout)
-------> ReadTimeout(Timeout)
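The hierarchy above can be verified directly with issubclass:

```python
from requests.exceptions import (
    RequestException, HTTPError, ConnectionError, Timeout,
    ConnectTimeout, ReadTimeout,
)

# everything Requests raises derives from RequestException (itself an IOError)
for exc in (HTTPError, ConnectionError, Timeout, ConnectTimeout, ReadTimeout):
    assert issubclass(exc, RequestException)
assert issubclass(RequestException, IOError)

# ConnectTimeout inherits from both ConnectionError and Timeout,
# which is why it is safe to retry: no data reached the server yet
assert issubclass(ConnectTimeout, ConnectionError)
assert issubclass(ConnectTimeout, Timeout)
assert issubclass(ReadTimeout, Timeout)
```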

Python - How To

debug in ssh

# first shell
# somehow this is needed for `ipdb` to run correctly
> %pdb

# first shell
> import pdb
> pdb.runcall(function_foo, ...any args and kwargs)

get the file path to python source code in iPython with ??

in iPython

>>> import requests
>>> requests??
>>> import this
>>> this??
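Outside iPython, the standard inspect module recovers the same information (shown here on the stdlib json module):

```python
import inspect
import json

# path to the module's source file, and the source of one of its functions
source_file = inspect.getsourcefile(json)
source_code = inspect.getsource(json.dumps)

assert source_file.endswith("__init__.py")  # json is a package
assert "def dumps" in source_code
```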

get previously run celery tasks

from django_celery_results.models import TaskResult
from pprint import pprint

# get tasks by recency
task_result = TaskResult.objects.order_by('-date_created')[:5]
for i in task_result:
    pprint(vars(i))

# get failed tasks
task_result = TaskResult.objects.filter(status='FAILURE').order_by('date_done')[:10]
for i in task_result:
    pprint(vars(i))

find the dependencies of a python package

Two ways

  1. Go into a shell with the dependencies installed
In [1]:  from importlib.metadata import requires

In [2]: requires('Django')
Out[2]:
['asgiref (<4,>=3.3.2)',
 'pytz',
 'sqlparse (>=0.2.2)',
 "argon2-cffi (>=19.1.0) ; extra == 'argon2'",
 "bcrypt ; extra == 'bcrypt'"]
  2. Read the poetry.lock file
[[package]]
name = "django"
version = "3.2.20"
description = "A high-level Python Web framework that encourages rapid development and clean, pragmatic design."
optional = false
python-versions = ">=3.6"
files = [
    {file = "Django-3.2.20-py3-none-any.whl", hash = "sha256:a477ab326ae7d8807dc25c186b951ab8c7648a3a23f9497763c37307a2b5ef87"},
    {file = "Django-3.2.20.tar.gz", hash = "sha256:dec2a116787b8e14962014bf78e120bba454135108e1af9e9b91ade7b2964c40"},
]

[package.dependencies]
asgiref = ">=3.3.2,<4"
pytz = "*"
sqlparse = ">=0.2.2"

[package.extras]
argon2 = ["argon2-cffi (>=19.1.0)"]
bcrypt = ["bcrypt"]

profile performance in iPython

`%%prun -r` returns the profiling object; `%%prun -D` dumps the stats to a file:

%%prun -D stats

from oneview.models.helpers.imports import pull_accounts_no_threaded
pull_accounts_no_threaded()
import pstats
from pstats import SortKey

stats = pstats.Stats('stats')

# `tottime` for the total time spent in the given function (and excluding time
# made in calls to sub-functions)
stats.sort_stats('tottime').print_stats(50)
stats.sort_stats('cumtime').print_stats(50)
stats.sort_stats('filename').print_stats(50)

Alternatively, just `%%timeit` once

%%timeit -n 1 -r 1

from oneview.models.helpers.imports import pull_accounts, pull_accounts_no_threaded
pull_accounts()

See also - pstats sort keys - `%prun` documentation - `%timeit` documentation
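The same cProfile/pstats machinery that %%prun drives can also be used directly in plain Python; busy_function below is a made-up workload:

```python
import cProfile
import io
import pstats

def busy_function():
    # arbitrary CPU-bound work so there is something to profile
    return sum(i * i for i in range(100_000))

profiler = cProfile.Profile()
profiler.enable()
busy_function()
profiler.disable()

# print the 5 functions with the highest own time (`tottime`)
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("tottime").print_stats(5)

assert "busy_function" in buffer.getvalue()
```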

run black or flake8 on python code in markdown

this generalises to any file type

  1. visually select the lines of python code in the markdown code block
  2. :'<,'>!black -q - to replace the visually selected lines with the formatted standard output of the external black command (see :help filter)
  3. :'<,'>w !flake8 - to use w to echo flake8’s output instead of replacing visually selected lines (see :help :w_c)

Alternatively, I can open the output in a split

  1. :'<,'>w !flake8 - > quickfix.vim to pipe the output into quickfix.vim
  2. <leader>el

use pdb for debugging

Debugging after ssh into uat

Prerequisite

Go into the ipython shell e.g. sudo docker exec -it oneview-django poetry run python manage.py shell

In [1]: from oneview.graphql.api.charge import one_fee_calculator

In [2]: review_id = "73a07db5-7214-4ead-a9b8-4906e1727a8c"

In [3]: import pdb
        pdb.runcall(one_fee_calculator, review_id)
[1] > /app/backend/oneview/graphql/api/charge.py(248)one_fee_calculator()
-> review = Review.objects.get(id=review_id)

If you do the above, you will be dropped into the pdb debugging shell.

Note: please be careful about possible side effects of the functions called, especially if you are doing this in a uat or even prod environment.

pdb

pdb is an interactive source code debugger for Python programs.

pdb is very powerful, though you need to get familiar with the commands

You can find the pdb commands doc here

The commands I find most useful are n(ext), s(tep), c(ontinue), b(reak), p(rint), l(ist), w(here), and q(uit).

set timeout in Requests

# timeout = (connect timeout, read timeout)
requests.get('https://github.com', timeout=(3, 27))

By default, Requests does not time out unless a timeout value is set explicitly. Without a timeout, your code may hang for minutes or more. It is good practice to set connect timeouts to slightly larger than a multiple of 3, which is the default TCP packet retransmission window. (From the Requests docs.)

use urllib instead of Requests

POST with json data

import json
import logging
from urllib import request
from urllib.error import HTTPError

_logger = logging.getLogger(__name__)

json_as_bytes = json.dumps(payload).encode("utf-8")
request_object = request.Request(self.api_url, data=json_as_bytes)
request_object.add_header("Content-Type", "application/json")
request_object.add_header("Authorization", f"Bearer {self.token}")
try:
    with request.urlopen(request_object) as response:
        res_json = json.loads(response.read())
except HTTPError as e:
    _logger.exception(e.code)
    _logger.exception(e.read())
    raise
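The Request object can be inspected without touching the network; the endpoint and payload below are made up for illustration:

```python
import json
from urllib import request

payload = {"hello": "world"}  # hypothetical payload
body = json.dumps(payload).encode("utf-8")

req = request.Request("https://api.example.com/endpoint", data=body)
req.add_header("Content-Type", "application/json")

# a Request with a data payload defaults to the POST method
assert req.get_method() == "POST"
# header keys are stored capitalised internally
assert req.get_header("Content-type") == "application/json"
assert req.data == b'{"hello": "world"}'
```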

See Also: urllib.request doc

use a decorator to print function calls

def print_function_name(func):
    def _print_function_name(*args, **kwargs):
        print(f"--> begin:  {func.__name__}")
        result = func(*args, **kwargs)
        print(f"--> return: {func.__name__}")
        return result
    return _print_function_name
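Applied to a small (hypothetical) function, the decorator prints a line on entry and on return; the definition is repeated so the snippet is self-contained:

```python
def print_function_name(func):
    def _print_function_name(*args, **kwargs):
        print(f"--> begin:  {func.__name__}")
        result = func(*args, **kwargs)
        print(f"--> return: {func.__name__}")
        return result
    return _print_function_name

@print_function_name
def add(a, b):  # made-up example function
    return a + b

result = add(2, 3)
# prints:
# --> begin:  add
# --> return: add
assert result == 5
```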

enable iPython autoload

Reload modules automatically before executing user code, so that edited source is picked up without exiting and restarting the iPython shell.

%load_ext autoreload
%autoreload 2

Use logging module

Usage

import logging

logging.basicConfig(
    # filename="foo.log",
    # NB: `encoding` only takes effect when `filename` is given (Python 3.9+)
    encoding="utf-8",
    level=logging.DEBUG,
    format='%(levelname)s %(asctime)s %(module)s %(message)s',
)

logger = logging.getLogger(__name__)
logger.debug("foo bar baz")

Log Levels Explained

| Level | When it's used |
| --- | --- |
| DEBUG | Detailed information, typically of interest only when diagnosing problems. |
| INFO | Confirmation that things are working as expected. |
| WARNING | An indication that something unexpected happened, or indicative of some problem in the near future (e.g. 'disk space low'). The software is still working as expected. |
| ERROR | Due to a more serious problem, the software has not been able to perform some function. |
| CRITICAL | A serious error, indicating that the program itself may be unable to continue running. |
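Levels are plain integers under the hood, so the ordering and the resulting filtering can be checked directly:

```python
import logging

# levels are increasing integers, so comparisons work as expected
assert logging.DEBUG < logging.INFO < logging.WARNING < logging.ERROR < logging.CRITICAL

logger = logging.getLogger("level_demo")  # hypothetical logger name
logger.setLevel(logging.INFO)

# DEBUG is below the logger's level, so such records would be dropped
assert not logger.isEnabledFor(logging.DEBUG)
assert logger.isEnabledFor(logging.WARNING)
```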

Loggers, Handlers, Filters and Formatters

The logging library takes a modular approach and offers several categories of components: loggers, handlers, filters, and formatters.

| Component | Function |
| --- | --- |
| Loggers | expose the interface that application code directly uses |
| Handlers | send the log records (created by loggers) to the appropriate destination |
| Filters | provide a finer grained facility for determining which log records to output |
| Formatters | specify the layout of log records in the final output |

More on Loggers

getLogger() returns a reference to a logger instance with the specified name if it is provided, or root if not. The names are period-separated hierarchical structures. Multiple calls to getLogger() with the same name will return a reference to the same logger object. Loggers that are further down in the hierarchical list are children of loggers higher up in the list. For example, given a logger with a name of foo, loggers with names of foo.bar, foo.bar.baz, and foo.bam are all descendants of foo.

Loggers have a concept of effective level. If a level is not explicitly set on a logger, the level of its parent is used instead as its effective level. If the parent has no explicit level set, its parent is examined, and so on - all ancestors are searched until an explicitly set level is found. The root logger always has an explicit level set (WARNING by default). When deciding whether to process an event, the effective level of the logger is used to determine whether the event is passed to the logger’s handlers.

Child loggers propagate messages up to the handlers associated with their ancestor loggers. Because of this, it is unnecessary to define and configure handlers for all the loggers an application uses. It is sufficient to configure handlers for a top-level logger and create child loggers as needed. (You can, however, turn off propagation by setting the propagate attribute of a logger to False.)
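A small sketch of the naming hierarchy and effective levels described above:

```python
import logging

parent = logging.getLogger("foo")
child = logging.getLogger("foo.bar.baz")

# getLogger with the same name always returns the same object
assert logging.getLogger("foo") is parent

# with no level set on the child, the effective level comes from an ancestor
parent.setLevel(logging.ERROR)
assert child.getEffectiveLevel() == logging.ERROR

# propagation to ancestor handlers is on by default
assert child.propagate
```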

If no configuration is used

If no logging configuration is used, behaviour depends on how you log. The module-level convenience functions (logging.warning() and friends) call basicConfig() automatically, so output goes to stderr formatted like level:logger name:message at level WARNING and above. A logger that has no handlers anywhere in its hierarchy instead falls back to a special handler, lastResort, which writes just the bare message to stderr at level WARNING and above.

import logging

logging.debug("abc")
logging.info("abc")
logging.warning("abc")
logging.error("abc")

# Output
# WARNING:root:abc
# ERROR:root:abc

logger = logging.getLogger(__name__)
logger.debug("abc")
logger.info("abc")
logger.warning("abc")
logger.error("abc")

# Output
# WARNING:__main__:abc
# ERROR:__main__:abc


what is a python wheel?

output for installing a source distribution (not a wheel)

pip downloads the tar.gz and builds a wheel locally:

> python -m pip install 'uwsgi==2.0.*'

Collecting uwsgi==2.0.*
  Downloading uwsgi-2.0.22.tar.gz (809 kB)
     ---------------------------------------- 809.7/809.7 kB 13.4 MB/s eta 0:00:00
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: uwsgi
  Building wheel for uwsgi (setup.py) ... done
  Created wheel for uwsgi: filename=uWSGI-2.0.22-cp311-cp311-macosx_13_0_arm64.whl size=400536 sha256=a79b882b505a3093feed13f859dfa01e1ce04651abd125d418509505bc861d94
  Stored in directory: /Users/yuhao.huang/Library/Caches/pip/wheels/93/59/2d/d21852a9f9607e9494b5d3c96d11f348d11039f7c47223c9ce
Successfully built uwsgi
Installing collected packages: uwsgi
Successfully installed uwsgi-2.0.22

output for installing a wheel

There is no build stage when pip finds a compatible wheel on PyPI:

> python -m pip install 'chardet==3.*'
Collecting chardet==3.*
  Downloading chardet-3.0.4-py2.py3-none-any.whl (133 kB)
     ---------------------------------------- 133.4/133.4 kB 4.8 MB/s eta 0:00:00
Installing collected packages: chardet
Successfully installed chardet-3.0.4

A Python .whl file is essentially a ZIP (.zip) archive with a specially crafted filename that tells installers what Python versions and platforms the wheel will support.

{dist}-{version}(-{build})?-{python}-{abi}-{platform}.whl

cryptography-2.9.2-cp35-abi3-macosx_10_9_x86_64.whl
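A naive split of such a filename shows the pieces (real parsing should use the packaging library; this assumes there is no optional build tag):

```python
filename = "cryptography-2.9.2-cp35-abi3-macosx_10_9_x86_64.whl"

# strip ".whl" and split on "-"; works here because there is no build tag
dist, version, python_tag, abi_tag, platform_tag = filename[: -len(".whl")].split("-")

assert dist == "cryptography"
assert version == "2.9.2"
assert python_tag == "cp35"  # CPython 3.5
assert abi_tag == "abi3"     # the stable ABI
assert platform_tag == "macosx_10_9_x86_64"
```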

fileobj vs string vs byte string