Playing with Jupyter Notebooks and Python
This GitHub Pages site was set up with Python and Jupyter Notebooks in mind. This post verifies that your tools work by running some Python.
Python and Jupyter Notebooks
Python is a highly versatile and widely-used programming language, renowned for its readability and broad library support.
Jupyter Notebooks is an interactive computing environment that enables users to create and share documents containing live code, equations, visualizations, and narrative text.
Together, Python and Jupyter Notebooks form a powerful toolkit for data analysis, scientific research, and education.
We will play with Python and Jupyter Notebooks to get a feel for both. This is a great interactive way to start development.
Emoji Print
It is easy to add an emoji to a message in code. However, using the emoji library or other third-party libraries requires you to install them on your machine first. Before using a library that is not part of the standard Python distribution, you must install it with pip.
# terminal command to install library
$ pip install emoji
Collecting emoji
Downloading emoji-2.5.1.tar.gz (356 kB)
...
Successfully installed emoji-2.5.1
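If you are unsure whether a library is already installed, you can check from Python before reaching for pip. The following is a minimal sketch using importlib.util.find_spec from the standard library; the helper name is_installed is made up for this illustration.

```python
import importlib.util

def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None

# "json" ships with Python; "emoji" is only found after pip install emoji
for name in ["json", "emoji"]:
    if is_installed(name):
        print(name + ": installed")
    else:
        print(name + ": missing, try: pip install " + name)
```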
#!pip install emoji
from emoji import emojize
print(emojize(":thumbs_up: Python is awesome! :grinning_face:"))
👍 Python is awesome! 😀
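For comparison, Python strings can also hold emoji directly through Unicode character names, with no extra library to install. This sketch assumes the two emoji above correspond to the Unicode names THUMBS UP SIGN and GRINNING FACE.

```python
# Unicode character names are built into Python string literals
message = "\N{THUMBS UP SIGN} Python is awesome! \N{GRINNING FACE}"
print(message)
```

The emoji library is still handy when you want short aliases such as :thumbs_up: instead of the full Unicode names.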
Extracting Data
Web sites become a lot more interesting when you are working with existing data rather than trying to create it. Here is some code that uses a library called newspaper to extract a couple of writeups from the CNN Entertainment site.
- Learn more about newspaper3k
- Learn about the wikipedia library
#!pip install newspaper3k lxml_html_clean
from newspaper import Article
from IPython.display import display, Markdown
urls = ["http://cnn.com/2023/03/29/entertainment/the-mandalorian-episode-5-recap/index.html",
"https://www.cnn.com/2023/06/09/entertainment/jurassic-park-anniversary/index.html"]
for url in urls:
    article = Article(url)
    article.download()
    article.parse()
    # print(article.title)
    # Jupyter Notebook Display
    display(Markdown(article.title)) # Jupyter display only
    display(Markdown(article.text)) # Jupyter display only
    print("\n")
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
Cell In[5], line 2
1 #!pip install newspaper3k lxml_html_clean
----> 2 from newspaper import Article
3 from IPython.display import display, Markdown
6 urls = ["http://cnn.com/2023/03/29/entertainment/the-mandalorian-episode-5-recap/index.html",
7 "https://www.cnn.com/2023/06/09/entertainment/jurassic-park-anniversary/index.html"]
File ~/nighthawk/neil_2025/venv/lib/python3.12/site-packages/newspaper/__init__.py:10
7 __license__ = 'MIT'
8 __copyright__ = 'Copyright 2014, Lucas Ou-Yang'
---> 10 from .api import (build, build_article, fulltext, hot, languages,
11 popular_urls, Configuration as Config)
12 from .article import Article, ArticleException
13 from .mthreading import NewsPool
File ~/nighthawk/neil_2025/venv/lib/python3.12/site-packages/newspaper/api.py:12
9 __license__ = 'MIT'
10 __copyright__ = 'Copyright 2014, Lucas Ou-Yang'
---> 12 import feedparser
14 from .article import Article
15 from .configuration import Configuration
File ~/nighthawk/neil_2025/venv/lib/python3.12/site-packages/feedparser/__init__.py:28
1 # Copyright 2010-2023 Kurt McKee <contactme@kurtmckee.org>
2 # Copyright 2002-2008 Mark Pilgrim
3 # All rights reserved.
(...)
25 # ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
26 # POSSIBILITY OF SUCH DAMAGE."""
---> 28 from .api import parse
29 from .datetimes import registerDateHandler
30 from .exceptions import *
File ~/nighthawk/neil_2025/venv/lib/python3.12/site-packages/feedparser/api.py:32
30 import urllib.error
31 import urllib.parse
---> 32 import xml.sax
34 from .datetimes import registerDateHandler, _parse_date
35 from .encodings import convert_to_utf8
ModuleNotFoundError: No module named 'xml.sax'
#!pip install wikipedia
import wikipedia
from IPython.display import display, Markdown # add for Jupyter
terms = ["Python (programming language)", "JavaScript"]
for term in terms:
    # Search for a page
    result = wikipedia.search(term)
    # Get the summary of the first result
    summary = wikipedia.summary(result[0])
    print(term)
    # print(summary) # console display
    display(Markdown(summary)) # Jupyter display
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[6], line 2
1 #!pip install wikipedia
----> 2 import wikipedia
3 from IPython.display import display, Markdown # add for Jupyter
5 terms = ["Python (programming language)", "JavaScript"]
File ~/nighthawk/neil_2025/venv/lib/python3.12/site-packages/wikipedia/__init__.py:1
----> 1 from .wikipedia import *
2 from .exceptions import *
4 __version__ = (1, 4, 0)
File ~/nighthawk/neil_2025/venv/lib/python3.12/site-packages/wikipedia/wikipedia.py:3
1 from __future__ import unicode_literals
----> 3 import requests
4 import time
5 from bs4 import BeautifulSoup
File ~/nighthawk/neil_2025/venv/lib/python3.12/site-packages/requests/__init__.py:45
41 import warnings
43 import urllib3
---> 45 from .exceptions import RequestsDependencyWarning
47 try:
48 from charset_normalizer import __version__ as charset_normalizer_version
File ~/nighthawk/neil_2025/venv/lib/python3.12/site-packages/requests/exceptions.py:9
1 """
2 requests.exceptions
3 ~~~~~~~~~~~~~~~~~~~
4
5 This module contains the set of Requests' exceptions.
6 """
7 from urllib3.exceptions import HTTPError as BaseHTTPError
----> 9 from .compat import JSONDecodeError as CompatJSONDecodeError
12 class RequestException(IOError):
13 """There was an ambiguous exception that occurred while handling your
14 request.
15 """
File ~/nighthawk/neil_2025/venv/lib/python3.12/site-packages/requests/compat.py:30
26 pass
27 return chardet
---> 30 chardet = _resolve_char_detection()
32 # -------
33 # Pythons
34 # -------
35
36 # Syntax sugar.
37 _ver = sys.version_info
File ~/nighthawk/neil_2025/venv/lib/python3.12/site-packages/requests/compat.py:24, in _resolve_char_detection()
22 if chardet is None:
23 try:
---> 24 chardet = importlib.import_module(lib)
25 except ImportError:
26 pass
File /opt/homebrew/Cellar/python@3.12/3.12.5/Frameworks/Python.framework/Versions/3.12/lib/python3.12/importlib/__init__.py:90, in import_module(name, package)
File ~/nighthawk/neil_2025/venv/lib/python3.12/site-packages/charset_normalizer/__init__.py:24
2 """
3 Charset-Normalizer
4 ~~~~~~~~~~~~~~
(...)
20 :license: MIT, see LICENSE for more details.
21 """
22 import logging
---> 24 from .api import from_bytes, from_fp, from_path, is_binary
25 from .legacy import detect
26 from .models import CharsetMatch, CharsetMatches
File ~/nighthawk/neil_2025/venv/lib/python3.12/site-packages/charset_normalizer/api.py:5
2 from os import PathLike
3 from typing import BinaryIO, List, Optional, Set, Union
----> 5 from .cd import (
6 coherence_ratio,
7 encoding_languages,
8 mb_encoding_languages,
9 merge_coherence_ratios,
10 )
11 from .constant import IANA_SUPPORTED, TOO_BIG_SEQUENCE, TOO_SMALL_SEQUENCE, TRACE
12 from .md import mess_ratio
File ~/nighthawk/neil_2025/venv/lib/python3.12/site-packages/charset_normalizer/cd.py:14
5 from typing import Counter as TypeCounter, Dict, List, Optional, Tuple
7 from .constant import (
8 FREQUENCIES,
9 KO_NAMES,
(...)
12 ZH_NAMES,
13 )
---> 14 from .md import is_suspiciously_successive_range
15 from .models import CoherenceMatches
16 from .utils import (
17 is_accentuated,
18 is_latin,
(...)
21 unicode_range,
22 )
AttributeError: partially initialized module 'charset_normalizer' has no attribute 'md__mypyc' (most likely due to a circular import)
Inspecting a Function
The inspect module can show you the source code behind many Python functions and objects. This helps you explore the code behind the libraries you are using.
- Inspect documentation.
import inspect
from newspaper import Article
# show the source code of newspaper's Article class
print(inspect.getsource(Article))
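The same idea works on the standard library, which is useful when a third-party package is not installed. This is a small sketch that inspects statistics.mean (chosen here only as an example of a function written in Python) and shows what happens with a built-in implemented in C.

```python
import inspect
import statistics

# Signature: the parameters the function accepts
print(inspect.signature(statistics.mean))

# Source: available for functions written in Python
source = inspect.getsource(statistics.mean)
print(source[:300])  # first 300 characters only

# Built-ins implemented in C have no Python source to show
try:
    inspect.getsource(len)
except TypeError as err:
    print("No source available:", err)
```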
Python Data Types
Dynamic typing means that the type of a variable is determined only at runtime. Strong typing means that values do have a type and that the type matters when performing operations. The illustration below defines three functions:
- mean: declares the types it accepts, then checks the input before calling an average function
- average, average2: each calculates the average of a list of numbers
Python has types. The language supports type hints, but they are optional and many coders omit them. In other languages, like Java and C, you must declare types.
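A short sketch of both ideas: a name can be rebound to a value of a different type at runtime (dynamic typing), but Python refuses to mix incompatible types in an operation (strong typing). The variable names are chosen only for illustration.

```python
# Dynamic typing: the same name can refer to values of different types over time
value = 42
print(type(value).__name__)
value = "forty-two"
print(type(value).__name__)

# Strong typing: Python will not silently combine a str and an int
try:
    result = "Score: " + 100
except TypeError as err:
    print("TypeError:", err)
    result = "Score: " + str(100)  # an explicit conversion is required
print(result)
```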
import sys
from typing import Union
# Define types for mean function, trying to analyze input possibilities
Number = Union[int, float] # Number can be either int or float type
Numbers = list[Number] # Numbers is a list of Number types
Scores = Union[Number, Numbers] # Scores can be single or multiple
def mean(scores: Scores, method: int = 1) -> float:
    """
    Calculate the mean of a list of scores.
    average and average2 are hidden (nested) functions performing the mean algorithm.
    If a single score is provided in scores, it is returned as the mean.
    If a list of scores is provided, the average is calculated, rounded, and returned.
    """
    def average(scores):
        """Calculate the average of a list of scores using a Python for loop."""
        total = 0 # running sum; avoids shadowing the built-in sum()
        count = 0 # number of valid scores; avoids shadowing the built-in len()
        for score in scores:
            if isinstance(score, Number): # Union in isinstance needs Python 3.10+
                total += score
                count += 1
            else:
                print("Bad data: " + str(score) + " in " + str(scores))
                sys.exit()
        return total / count
    def average2(scores):
        """Calculate the average of a list of scores using the built-in sum() function."""
        return sum(scores) / len(scores)
    # test to see if scores is a list of numbers
    if isinstance(scores, list):
        if method == 1:
            # long method
            result = average(scores)
        else:
            # built in method
            result = average2(scores)
        return round(result, 2) # round to two decimal places
    return scores # case where scores is a single value
# try with one number
singleScore = 100
print("Print test data: " + str(singleScore)) # concat data for single line
print("Mean of single number: " + str(mean(singleScore)))
print()
# define a list of numbers
testScores = [90.5, 100, 85.4, 88]
print("Print test data: " + str(testScores))
print("Average score, loop method: " + str(mean(testScores)))
print("Average score, function method: " + str(mean(testScores, 2)))
print()
badData = [100, "NaN", 90]
print("Print test data: " + str(badData))
print("Mean with bad data: " + str(mean(badData)))
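As a cross-check on the hand-written mean, the standard library's statistics module provides the same calculation along with a few related ones. This is a sketch, not a replacement for the exercise above; it reuses the same test scores and the variable names are illustrative.

```python
import statistics

test_scores = [90.5, 100, 85.4, 88]

mean_score = statistics.mean(test_scores)
median_score = statistics.median(test_scores)
spread = statistics.stdev(test_scores)  # sample standard deviation

print("Mean:", round(mean_score, 2))
print("Median:", median_score)
print("Std dev:", round(spread, 2))

# Non-numeric data raises TypeError instead of exiting the program
try:
    statistics.mean([100, "NaN", 90])
except TypeError as err:
    print("Bad data:", err)
```

Catching an exception keeps the notebook kernel running, whereas sys.exit() stops execution.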
Hacks
Here is a summary of some of the things learned above.
- Formatting messages with emoji
- Exploring data with newspaper and wikipedia libraries
- Inspecting the source code behind a library we used
- Learning about data types while writing an algorithm for mean
Part of project-based learning is the idea of combining concepts to form something more interesting. Make a plan, form some ideas, and brainstorm with your pair. Produce something that is interesting and challenging. Samples:
- Could I get input from the user to look up wikipedia information? Python input, Article on Input
- What could I learn in Python about Stats to get Machine Learning Ready? Stats Calculations
- Could I add emoji to an extracted article? String Find, String Methods
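As a starting point for the emoji idea, here is a sketch that uses str.find and str.replace to decorate text. The sample sentence and the keyword-to-emoji mapping are hypothetical stand-ins, not real extracted article content.

```python
# Stand-in for extracted article text (hypothetical sample)
text = "The new movie is awesome. Critics say the sequel is even more awesome."

# Hypothetical keyword-to-emoji mapping, written with Unicode character names
emoji_for = {"movie": "\N{CLAPPER BOARD}", "awesome": "\N{THUMBS UP SIGN}"}

# str.find returns the index of the first match, or -1 if the word is absent
print(text.find("awesome"))

# str.replace adds the emoji after every occurrence of each keyword
decorated = text
for word, symbol in emoji_for.items():
    decorated = decorated.replace(word, word + " " + symbol)
print(decorated)
```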