Chapter 02 - The Python Language

02. The Python Language

Intro

To learn Python from scratch, we suggest you start with the appropriate links from the online docs/tutorial.

Python style guide, use PEP 8 (optionally augmented by extra guidelines such as The Hitchhiker’s Guide, CKAN’s, and Google’s).

Lexical Structure

The lexical structure of a programming language is the set of basic rules that govern how you write programs in that language. It is the lowest-level syntax of the language, specifying such things as what variable names look like and how to denote comments. Each Python source file, like any other text file, is a sequence of characters. You can also usefully consider it a sequence of lines, tokens, or statements. These different lexical views complement each other.

Python is very particular about program layout, especially lines and indentation.

Lines and Indentation

Python programs are made up of a sequence of logical lines, which are groups of one or more physical lines. Logical lines are separated by blank lines or by the end of a physical line. Python uses indentation to indicate the block structure of a program, rather than using braces or other delimiters. All statements in a block must have the same indentation. The first statement in a source file must have no indentation, and statements typed at the interactive interpreter must also have no indentation.

Character Sets

Tokens

Python breaks each logical line into a sequence of elementary lexical components known as tokens. Each token corresponds to a substring of the logical line. The normal token types are identifiers, keywords, operators, delimiters, and literals.

Identifiers

An identifier is a name used to specify a variable, function, class, module, or other object. An identifier starts with a letter (any character that Unicode classifies as a letter) or an underscore (_), followed by zero or more letters, underscores, digits or other characters that Unicode classifies as digits or combining marks (as defined in Unicode® Standard Annex #31).

Keywords

Python has 35 keywords, which are identifiers that Python reserves for special syntactic uses. Like identifiers, keywords are case-sensitive. You cannot use keywords as regular identifiers (thus, they’re sometimes known as “reserved words”). You can list them by importing the keyword module and printing keyword.kwlist.

>>>import keyword
>>>keyword.kwlist
['False', 'None', 'True', 'and', 'as', 'assert', 'async', 'await', 'break', 'class', 'continue', 
    'def', 'del', 'elif', 'else', 'except', 'finally', 'for', 'from', 'global', 'if', 'import', 'in', 
    'is', 'lambda', 'nonlocal', 'not', 'or', 'pass', 'raise', 'return', 'try', 'while', 'with', 'yield']

In addition, Python 3.9 introduced the concept of soft keywords. Soft keywords are keywords that are context-sensitive. They are language keywords for some specific syntax constructs, but outside of those constructs they may be used as variable or function names, so they are not reserved words. No soft keywords were defined in Python 3.9, just in Python 3.10. You can list them from the keyword module by printing keyword.softkwlist.

>>>keyword.softkwlist
['_', 'case', 'match']

Operators

Python uses non-alphanumeric characters and character combinations as operators. Python recognizes the following operators, which are covered in detail in “Expressions and Operators”:

+  -  *  /  %   **  //  <<  >>  &   @
|  ^  ~  <  <=  >   >=  <>  !=  ==  @=  :=

You can use @ as an operator (in matrix multiplication, covered in Chapter 15), although the character is technically a delimiter.

Delimiters

Python uses the following characters and combinations as delimiters in expressions, list, dictionary, and set literals and comprehensions, and various statements, among other purposes:

(    )    [    ]    {    }
,    :    .    `    =    ;   @
+=   -=   *=   /=   //=  %=
&=   |=   ^=   >>=  <<=  **=

The period (.) can also appear in floating-point literals (e.g., 2.3) and imaginary literals (e.g., 2.3j). The last two rows are the augmented assignment operators, which are delimiters, but also perform operations. We discuss the syntax for the various delimiters when we introduce the objects or statements using them.

The following characters have special meanings as part of other tokens:

'  "  #  \

' and " surround string literals. # outside of a string starts a comment, which ends at the end of the current line. \ at the end of a physical line joins the following physical line into one logical line; \ is also an escape character in strings. The characters $ and ?, and all control characters1 except whitespace, can never be part of the text of a Python program, except in comments or string literals.

Literals

A literal is the direct denotation in a program of a data value (a number, string, or container). The following are number and string literals in Python:

42                       # Integer literal
3.14                     # Floating-point literal
1.0j                     # Imaginary literal
'hello'                  # String literal
"world"                  # Another string literal
"""Good
night"""                 # Triple-quoted string literal, spanning 2 lines

Combining number and string literals with the appropriate delimiters, you can build literals that directly denote data values of container types:

[42, 3.14, 'hello']     # List
[]                      # Empty list
100, 200, 300           # Tuple
()                      # Empty tuple
{'x':42, 'y':3.14}      # Dictionary
{}                      # Empty dictionary
{1, 2, 4, 8, 'string'}  # Set

# There is no literal to denote an empty set; use set() instead

We cover the syntax for literals in detail in “Data Types”, when we discuss the various data types Python supports.

Statements

You can look at a Python source file as a sequence of simple and compound statements.

Simple statements

A simple statement is one that contains no other statements. A simple statement lies entirely within a logical line. As in many other languages, you may place more than one simple statement on a single logical line, with a semicolon (;) as the separator. However, one statement per line is the usual and recommended Python style, and makes programs more readable.

Any expression can stand on its own as a simple statement (we discuss expressions in “Expressions and Operators”). When working interactively, the interpreter shows the result of an expression statement you enter at the prompt (>>>) and binds the result to a global variable named _ (underscore). Apart from interactive sessions, expression statements are useful only to call functions (and other callables) that have side effects (e.g., perform output, change global variables, or raise exceptions).

An assignment is a simple statement that assigns values to variables, as we discuss in “Assignment Statements”. An assignment in Python using the = operator is a statement and can never be part of an expression. To perform an assignment as part of an expression, you must use the := (jokingly known as the “walrus”) operator.

Compound statements

A compound statement contains one or more other statements and controls their execution. A compound statement has one or more clauses, aligned at the same indentation. Each clause has a header starting with a keyword and ending with a colon (:), followed by a body, which is a sequence of one or more statements. Normally, these statements, also known as a block, are on separate logical lines after the header line, indented four spaces rightward. The block lexically ends when the indentation returns to that of the clause header (or further left from there, to the indentation of some enclosing compound statement). Alternatively, the body can be a single simple statement, following the : on the same logical line as the header. The body may also consist of several simple statements on the same line with semicolons between them, but, as we’ve already mentioned, this is not good Python style.


Data Types

The operation of a Python program hinges on the data it handles. Data values in Python are known as objects; each object, AKA value, has a type. An object’s type determines which operations you can perform on the value.

An object that can be altered is known as a mutable object, while one that cannot be altered is an immutable object.

The built-in type(obj) accepts any object as its argument and returns the type object that is the type of obj. The built-in function isinstance(obj, type) returns True when object obj has type type (or any subclass thereof); otherwise, it returns False.

Primitive Data Structures:

These are the most primitive or basic data structures. They are the building blocks for data manipulation and contain pure, simple values of data.

Non-Primitive Data Structures:

Non-primitive types are the sophisticated members of the data structure family. They don’t just store a value, but rather a collection of values in various formats.

Mutability vs Immutability:

In programming, you have an immutable object if you can’t change the object’s state after you’ve created it. In contrast, a mutable object allows you to modify its internal state after creation. In short, whether you’re able to change an object’s state or contained data is what defines if that object is mutable or immutable.

Immutable objects are common in functional programming, while mutable objects are widely used in object-oriented programming. Because Python is a multiparadigm programming language, it provides mutable and immutable objects for you to choose from when solving a problem. REF: realpython.com

Linear vs Non-Linear Data Structures:

Each element in a linear data structure only has one predecessor and one successor, and the elements are kept in a linear or sequential order. Arrays, linked lists, stacks, and queues are a few examples of linear data structures.

On the other hand, non-linear data structures are those in which the elements are stored in a hierarchical or tree-like structure rather than in a linear or sequential fashion. The non-linear data structures heaps, trees, and graphs are a few examples.

The way the elements are arranged and accessible is the primary distinction between linear and non-linear data structures. In linear data structures, elements are accessed sequentially and in order, one after the other. Depending on the particular structure, there are several ways to retrieve elements in non-linear data structures.

Variables and Other References

A Python program accesses data values through references. A reference is a “name” that refers to a value (object). References take the form of variables, attributes, and items. In Python, a variable or other reference has no intrinsic type. The object to which a reference is bound at a given time always has a type, but a given reference may be bound to objects of various types in the course of the program’s execution.

Expressions and Operators

WIP.

Set Operations

WIP.

Control Flow Statements

WIP.

The Math module

This topic is a collection of subjects that are at somewhat mathematical in nature, though many of them are of general interest. As mentioned in before, the math module contains some common math functions. Here are most of them:

Function Description
sin, cos, tan trig functions
asin, acos, atan inverse trig functions
atan2(y,x) gives arctan $(y/x)$ with proper sign behavior
sinh, cosh, tanh hyperbolic functions
asinh, acosh, atanh inverse hyperbolic functions
log, log10 natural log, log base 10
log1p log(1+x), more accurate near 1 than log
exp exponential function $e^x$
degrees, radians convert from radians to degrees or vice-versa
floor floor(x) is the greatest integer $\leq x$
ceil ceil(x) is the least integer $\geq x$
e, pi the constants e and $\pi$
factorial factorial
modf returns a pair (fractional part, integer part)
gamma, erf the $\Gamma$ function and the Error function

Note Note that the floor and int functions behave the same for positive numbers, but differently for negative numbers. For instance, floor(3.57) and int(3.57) both return the integer 3. However, floor(-3.57) returns -4, the greatest integer less than or equal to -3.57, while int(-3.57) returns -3, which is obtained by throwing away the decimal part of the number

Scientific notation

Look at the following code:

>>> 100.1**10
1.0100451202102516e+20

The resulting value is displayed in scientific notation. It is $1.0100451202102516 \times 10^20$. The e+20 stands for $\times 10^{20}$. Here is another example:

>>> .15**10
5.7665039062499975e-0

This is $5.7665039062499975 \times 10^{−9}$.

results matching ""

    No results matching ""