This lesson introduces Python string processing.

Objectives and Skills

Objectives and skills for this lesson include:

  • Standard Library
    • String operations

Readings

  1. Wikipedia: String (computer science)
  2. Python for Everyone: Strings

Multimedia

  1. YouTube: Python for Informatics - Chapter 6 - Strings
  2. YouTube: Python - Strings
  3. YouTube: Python - String Slicing
  4. YouTube: Python - String Formatting
  5. YouTube: Python Strings Playlist

Examples

len() function

The len() function returns the length (the number of items) of an object.[1]

string = "Test"

print("len(string):", len(string))

Output:

len(string): 4

Strings

Textual data in Python is handled with str objects, or strings. Strings are immutable sequences of Unicode code points.[2]

string = "Test"

print("Characters:")
for letter in string:
    print(letter)

print("\nCharacters by position:")
for i in range(len(string)):
    print("string[%d]: %c" % (i, string[i]))

Output:

Characters:
T
e
s
t

Characters by position:
string[0]: T
string[1]: e
string[2]: s
string[3]: t
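
Because strings are immutable, a character cannot be replaced in place. The following is a minimal sketch of what that means in practice; the names changed_in_place and new_string are chosen only for illustration.

```python
string = "Test"

# Item assignment is not supported, because str objects are immutable.
try:
    string[0] = "B"
    changed_in_place = True
except TypeError as error:
    changed_in_place = False
    print("TypeError:", error)

# Build a new string instead and bind it to a new name.
new_string = "B" + string[1:]

print("string:", string)
print("new_string:", new_string)
```

The original string is left unchanged; the concatenation creates a second str object.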

Membership Comparisons

The operators in and not in test for membership. x in s evaluates to true if x is a member of s, and false otherwise.[3]

alphabet = "abcdefghijklmnopqrstuvwxyz"
string = "Python Programming/Strings"

print("The string contains:")
for letter in alphabet:
    if letter in string.lower():
        print(letter)

Output:

The string contains:
a
g
h
i
m
n
o
p
r
s
t
y
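
For strings, in also tests for whole substrings, not only single characters, and the comparison is case-sensitive. A short sketch using the same sample string (the variable names are illustrative):

```python
string = "Python Programming/Strings"

has_slash = "/" in string                   # single-character test
has_word = "Program" in string              # substring test
has_lowercase_word = "program" in string    # case-sensitive, so no match
missing_digit = "7" not in string           # not in inverts the test

print("'/' in string:", has_slash)
print("'Program' in string:", has_word)
print("'program' in string:", has_lowercase_word)
print("'7' not in string:", missing_digit)
```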

String Methods

Strings implement all of the common sequence operations, along with additional methods such as case validation and conversion.[4]

string = "Test"

print("string:", string)
print("string.isalpha():", string.isalpha())
print("string.islower():", string.islower())
print("string.isnumeric():", string.isnumeric())
print("string.isspace():", string.isspace())
print("string.istitle():", string.istitle())
print("string.isupper():", string.isupper())
print("string.lower():", string.lower())
print("string.strip():", string.strip())
print("string.swapcase():", string.swapcase())
print("string.title():", string.title())
print("string.upper():", string.upper())

Output:

string: Test
string.isalpha(): True
string.islower(): False
string.isnumeric(): False
string.isspace(): False
string.istitle(): True
string.isupper(): False
string.lower(): test
string.strip(): Test
string.swapcase(): tEST
string.title(): Test
string.upper(): TEST
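
The example above applies strip() to a string that has no surrounding whitespace, so its effect is not visible. The sketch below uses a padded sample value (an assumption made for illustration) and uses repr() to make the whitespace visible.

```python
padded = "  hello world \n"

# strip() returns a new string without leading and trailing whitespace.
cleaned = padded.strip()
titled = cleaned.title()

print("padded:", repr(padded))
print("cleaned:", repr(cleaned))
print("titled:", repr(titled))

# isalpha() is False here because the space is not a letter.
print("cleaned.isalpha():", cleaned.isalpha())

# isnumeric() is True only when every character is numeric;
# the decimal point in "3.14" makes it False.
digits_only = "2024".isnumeric()
with_point = "3.14".isnumeric()
print("'2024'.isnumeric():", digits_only)
print("'3.14'.isnumeric():", with_point)
```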

String Parsing

Python substrings are referred to as slices. Slices are accessed using the syntax string[start:end], with the first character at index zero. The slice includes the characters from start up to, but not including, end. If end is omitted, the slice includes the characters from start through the end of the string. String slices may use negative indexing, in which case the index counts backwards from the end of the string.[5]

The find() method returns the lowest index in the string where a substring is found, optionally limited to the slice given by its start and end arguments. It returns -1 if the substring is not found.[6]

string = "Python Programming/Strings"
index = string.find("/")

if index >= 0:
    project = string[0:index]
    page = string[index + 1:]

    print("Project:", project)
    print("Page:", page)

Output:

Project: Python Programming
Page: Strings
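
Negative indexes and the not-found result of find() can be sketched with the same sample string. The variable names below are illustrative only.

```python
string = "Python Programming/Strings"

last_char = string[-1]       # last character
last_seven = string[-7:]     # last seven characters
without_page = string[:-8]   # everything except the last eight characters

print("string[-1]:", last_char)
print("string[-7:]:", last_seven)
print("string[:-8]:", without_page)

# find() returns -1 when the substring does not occur.
missing = string.find("#")
print("string.find('#'):", missing)

# An optional start argument limits the search to string[start:].
first_p = string.find("P")
second_p = string.find("P", 1)
print("string.find('P'):", first_p)
print("string.find('P', 1):", second_p)
```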

String Formatting

String objects have one unique built-in operation: the % operator (modulo). This is also known as the string formatting or interpolation operator. Given format % values (where format is a string), % conversion specifications in format are replaced with zero or more elements of values.[7]

value = 65.5

print("Value:", value)
print("Integer: %i" % value)
# In Python 3, %o, %x, and %c require an integer, so convert the float first.
print("Octal: %o" % int(value))
print("Hexadecimal: %x" % int(value))
print("Float: %.2f" % value)
print("Exponent: %.2e" % value)
print("Character: %c" % int(value))
print("String: %s" % value)
print("Multiple: %i, %o, %x, %.2f, %.2e, %c, %s" %
    (value, int(value), int(value), value, value, int(value), value))

Output:

Value: 65.5
Integer: 65
Octal: 101
Hexadecimal: 41
Float: 65.50
Exponent: 6.55e+01
Character: A
String: 65.5
Multiple: 65, 101, 41, 65.50, 6.55e+01, A, 65.5
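
When a format string contains more than one conversion specification, the values are supplied as a tuple. The sketch below also shows a minimum field width and a literal percent sign; the names name and score are sample data assumed for illustration.

```python
name = "Ada"
score = 93.456

# Two conversion specifications take a tuple of two values.
line = "%s scored %.1f points" % (name, score)

# A number between % and the conversion type sets a minimum field width.
padded = "[%5d]" % 42

# %% produces a literal percent sign.
percent = "%d%%" % 50

print(line)
print(padded)
print(percent)
```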

str.format() Method

The str.format() method uses format strings that contain “replacement fields” surrounded by curly braces {}. Anything that is not contained in braces is considered literal text, which is copied unchanged to the output.[8]

integer = 65
real = 65.5  # named real rather than float to avoid shadowing the built-in float()
string = "65.5"

print("Decimal: {:d}".format(integer))
print("Binary: {:b}".format(integer))
print("Octal: {:o}".format(integer))
print("Hexadecimal: {:x}".format(integer))
print("Character: {:c}".format(integer))
print("Float: {:.2f}".format(real))
print("Exponent: {:.2e}".format(real))
print("String: {:s}".format(string))
print("Multiple: {:d}, {:b}, {:o}, {:x}, {:c}, {:.2f}, {:.2e}, {:s}".format(
    integer, integer, integer, integer, integer, real, real, string))

Output:

Decimal: 65
Binary: 1000001
Octal: 101
Hexadecimal: 41
Character: A
Float: 65.50
Exponent: 6.55e+01
String: 65.5
Multiple: 65, 1000001, 101, 41, A, 65.50, 6.55e+01, 65.5
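
Replacement fields can also refer to arguments by position or by keyword, and doubled braces produce literal braces. A minimal sketch, with sample values chosen only for illustration:

```python
# Numbered fields may reuse the same positional argument.
by_position = "{0} and {1}, then {0} again".format("first", "second")

# Named fields are filled from keyword arguments.
by_name = "{project}/{page}".format(project="Python Programming", page="Strings")

# Doubled braces are copied to the output as single literal braces.
literal_braces = "{{}} is replaced by {}".format(65)

# A format specification can combine alignment, width, and precision.
aligned = "[{:>6.2f}]".format(65.5)

print(by_position)
print(by_name)
print(literal_braces)
print(aligned)
```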


String Interpolation

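Introduced in Python 3.6, formatted string literals (f-strings) provide string interpolation. Like the str.format() method, they use “replacement fields” surrounded by curly braces {}, but each field contains a Python expression that is evaluated when the string is created.[9] An uppercase or lowercase 'f' placed before the opening quotation mark identifies the string as a formatted string literal: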

integer = 65
real = 65.5  # named real rather than float to avoid shadowing the built-in float()
string = "65.5"

print(f"Decimal: {integer:d}")
print(f"Binary: {integer:b}")
print(f"Octal: {integer:o}")
print(f"Hexadecimal: {integer:x}")
print(f"Character: {integer:c}")
print(f"Float: {real:.2f}")
print(f"Exponent: {real:.2e}")
print(f"String: {string:s}")
print(f"Multiple: {integer:d}, {integer:b}, {integer:o}, {integer:x}, {integer:c}, {real:.2f}, {real:.2e}, {string:s}")

Output:

Decimal: 65
Binary: 1000001
Octal: 101
Hexadecimal: 41
Character: A
Float: 65.50
Exponent: 6.55e+01
String: 65.5
Multiple: 65, 1000001, 101, 41, A, 65.50, 6.55e+01, 65.5
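
A replacement field in a formatted string literal is not limited to a variable name; it may hold an expression such as arithmetic or a method call. The sketch below assumes Python 3.6 or later and uses sample names (width, height, name) for illustration.

```python
width = 3
height = 4
name = "strings"

# The expression inside the braces is evaluated when the string is created.
area_line = f"Area: {width * height}"

# Method calls are allowed inside a replacement field.
upper_line = f"Upper: {name.upper()}"

# A format specification may follow the expression after a colon.
ratio_line = f"Ratio: {width / height:.1%}"

print(area_line)
print(upper_line)
print(ratio_line)
```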

Activities

Tutorials

  1. Complete one or more of the following tutorials:

Practice

  1. Review Python.org: String methods. Create a Python program that asks the user for a line of text containing a first name and last name. Use string methods to parse the line and print out the name in the form last name, first initial, such as Lastname, F. Include a trailing period after the first initial. Ensure that the first letter of each name part is capitalized, and the rest of the last name is lower case. Include error handling in case the user does not enter exactly two name parts. Use a user-defined function for the actual string processing, separate from input and output.
  2. Review Python.org: String methods. Create a Python program that asks the user for a line of comma-separated values. It could be a sequence of test scores, names, or any other values. Use string methods to parse the line and print out each item on a separate line. Remove commas and any leading or trailing spaces from each item when printed. If the item is numeric, display it formatted as a floating point value with two decimal places. If the value is not numeric, display it as is.
  3. Review Python.org: String methods. Create a Python program that asks the user for a line of text that contains HTML tags, such as:
        <p><strong>This is a bold paragraph.</strong></p>
    Use string methods to search for and remove all HTML tags, and then print the remaining untagged text. Include error handling in case an HTML tag isn't entered correctly (an unmatched < or >). Use a user-defined function for the actual string processing, separate from input and output.
  4. Review Python.org: String methods. Create a Python program that asks the user for a line of text. Then ask the user for the number of characters to print in each line, the number of lines to be printed, and a scroll direction, right or left. Using the given line of text, duplicate the text as needed to fill the given number of characters per line. Then print the requested number of lines, shifting the entire line's content by one character, left or right, each time the line is printed. The first or last character will be shifted / appended to the other end of the string. For example:
        Repeat this. Repeat this. 
        epeat this. Repeat this. R
        peat this. Repeat this. Re

Lesson Summary

  • A string is traditionally a sequence of characters, either as a literal constant or as some kind of variable.[10]
  • A string variable may allow its elements to be mutated and the length changed, or it may be fixed (after creation).[11]
  • Python strings are immutable — they cannot be changed.[12]
  • The len() function returns the length (the number of items) of an object.[13]
  • Textual data in Python is handled with str objects, or strings. Strings are immutable sequences of Unicode code points.[14]
  • The operators in and not in test for membership. x in s evaluates to true if x is a member of s, and false otherwise.[15]
  • String methods include: isalpha(), islower(), isnumeric(), isspace(), istitle(), isupper(), lower(), strip(), swapcase(), title(), and upper().[16]
  • Python substrings are referred to as slices. Slices are accessed using the syntax string[start:end], with the first character index starting at zero. The slice will include the characters from start up to but not including end. If end is omitted, the slice will include the characters from start through the end of the string.[17]
  • String slices may use negative indexing, in which case the index counts backwards from the end of the string.[18]
  • The find() method returns the lowest index in the string where a substring is found within the given slice. It returns -1 if the substring is not found.[19]
  • String objects have one unique built-in operation: the % operator (modulo). This is also known as the string formatting or interpolation operator. Given format % values (where format is a string), % conversion specifications in format are replaced with zero or more elements of values.[20]
  • The str.format() method uses format strings that contain “replacement fields” surrounded by curly braces {}. Anything that is not contained in braces is considered literal text, which is copied unchanged to the output.[21]

Key Terms

counter
A variable used to count something, usually initialized to zero and then incremented.[22]
empty string
A string with no characters and length 0, represented by two quotation marks.[23]
format operator
An operator, %, that takes a format string and a tuple and generates a string that includes the elements of the tuple formatted as specified by the format string.[24]
format sequence
A sequence of characters in a format string, like %d, that specifies how a value should be formatted.[25]
format string
A string, used with the format operator, that contains format sequences.[26]
flag
A boolean variable used to indicate whether a condition is true.[27]
invocation
A statement that calls a method.[28]
immutable
The property of a sequence whose items cannot be assigned.[29]
index
An integer value used to select an item in a sequence, such as a character in a string.[30]
item
One of the values in a sequence.[31]
method
A function that is associated with an object and called using dot notation.[32]
object
Something a variable can refer to. For now, you can use “object” and “value” interchangeably.[33]
search
A pattern of traversal that stops when it finds what it is looking for.[34]
sequence
An ordered set; that is, a set of values where each value is identified by an integer index.[35]
slice
A part of a string specified by a range of indices.[36]
traverse
To iterate through the items in a sequence, performing a similar operation on each.[37]
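
Several of these terms (traverse, counter, search, flag, index) can be seen together in a short sketch. The functions count_char() and find_first() are hypothetical helpers written for illustration; the built-in count() and find() methods do the same work.

```python
def count_char(text, target):
    """Traverse text and count how often target occurs."""
    count = 0                  # counter, initialized to zero
    for char in text:          # traversal of each item in the sequence
        if char == target:
            count += 1         # increment the counter
    return count


def find_first(text, target):
    """Search text and return the index of the first match, or -1."""
    index = 0
    while index < len(text):
        if text[index] == target:
            return index       # the search stops at the first match
        index += 1
    return -1


string = "Python Programming/Strings"

n_count = count_char(string, "n")
g_index = find_first(string, "g")
found_z = find_first(string, "z") >= 0   # flag: True if "z" occurs

print("Count of 'n':", n_count)
print("First 'g' at index:", g_index)
print("Contains 'z':", found_z)
```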

Review Questions

  1. A string is _____.
    A string is traditionally a sequence of characters, either as a literal constant or as some kind of variable.
  2. A string variable may _____.
    A string variable may allow its elements to be mutated and the length changed, or it may be fixed (after creation).
  3. Python strings are _____ — they cannot be changed.
    Python strings are immutable — they cannot be changed.
  4. The len() function returns _____.
    The len() function returns the length (the number of items) of an object.
  5. Textual data in Python is handled with _____.
    Textual data in Python is handled with str objects, or strings. Strings are immutable sequences of Unicode code points.
  6. The operators _____ and _____ test for membership. x _____ s evaluates to true if x is a member of s, and false otherwise.
    The operators in and not in test for membership. x in s evaluates to true if x is a member of s, and false otherwise.
  7. String methods include:
    String methods include: isalpha(), islower(), isnumeric(), isspace(), istitle(), isupper(), lower(), strip(), swapcase(), title(), and upper().
  8. Python substrings are referred to as _____.
    Python substrings are referred to as slices.
  9. Slices are accessed using the syntax _____.
    Slices are accessed using the syntax string[start:end].
  10. The first character index in a slice starts at _____. The slice will include _____. If end is omitted, the slice will include _____.
    The first character index in a slice starts at zero. The slice will include the characters from start up to but not including end. If end is omitted, the slice will include the characters from start through the end of the string.
  11. String slices may use negative indexing, in which case _____.
    String slices may use negative indexing, in which case the index counts backwards from the end of the string.
  12. The find() method returns _____.
    The find() method returns the lowest index in the string where a substring is found within the given slice, and returns -1 if the substring is not found.
  13. String objects have one unique built-in operation: the % operator (modulo). This is also known as _____.
    String objects have one unique built-in operation: the % operator (modulo). This is also known as the string formatting or interpolation operator. Given format % values (where format is a string), % conversion specifications in format are replaced with zero or more elements of values.
  14. The str.format() method uses format strings that contain “replacement fields” surrounded by _____.
    The str.format() method uses format strings that contain “replacement fields” surrounded by curly braces {}. Anything that is not contained in braces is considered literal text, which is copied unchanged to the output.

References

  1. Python.org: Built-in Functions
  2. Python.org: Built-in Types
  3. Python.org: Expressions
  4. Python.org: Built-in Types
  5. Python.org: Strings
  6. Python.org: Built-in Types
  7. Python.org: printf-style String Formatting
  8. Python.org: Format String Syntax
  9. Real Python: Python 3's f-Strings: An Improved String Formatting Syntax (Guide). Retrieved 2019-02-22.
  10. Wikipedia: String (computer science)
  11. Wikipedia: String (computer science)
  12. Python.org: Strings
  13. Python.org: Built-in Functions
  14. Python.org: Built-in Types
  15. Python.org: Expressions
  16. Python.org: Built-in Types
  17. Python.org: Strings
  18. Python.org: Strings
  19. Python.org: Built-in Types
  20. Python.org: printf-style String Formatting
  21. Python.org: Format String Syntax
  22. PythonLearn: Strings
  23. PythonLearn: Strings
  24. PythonLearn: Strings
  25. PythonLearn: Strings
  26. PythonLearn: Strings
  27. PythonLearn: Strings
  28. PythonLearn: Strings
  29. PythonLearn: Strings
  30. PythonLearn: Strings
  31. PythonLearn: Strings
  32. PythonLearn: Strings
  33. PythonLearn: Strings
  34. PythonLearn: Strings
  35. PythonLearn: Strings
  36. PythonLearn: Strings
  37. PythonLearn: Strings
This article is issued from Wikiversity. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.