deeplearning update #33

Closed
pnposch wants to merge 26 commits into SS2025 from main

Conversation

@pnposch pnposch commented May 12, 2025

Contributor

No description provided.

@github-advanced-security

This pull request sets up GitHub code scanning for this repository. Once the scans have completed and the checks have passed, the analysis results for this pull request branch will appear on this overview. Once you merge this pull request, the 'Security' tab will show more code scanning analysis results (for example, for the default branch). Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results. For more information about GitHub code scanning, check out the documentation.

Copilot AI review requested due to automatic review settings March 23, 2026 10:32

Copilot AI left a comment

Pull request overview

Updates the course repository structure and environment/dependency setup while adding mission submission materials and refreshing lecture/exercise notebooks.

Changes:

  • Introduce UV-based Python project config (pyproject/.python-version) and update README/setup guidance.
  • Add mission proposal template + grading rubric and link them from the lecture index.
  • Update/clean several notebooks (paths, installs, removed failing cells) and add a decision tree DOT artifact.

Reviewed changes

Copilot reviewed 17 out of 46 changed files in this pull request and generated 3 comments.

Show a summary per file
| File | Description |
| --- | --- |
| talks.md | Adds a talks index placeholder referenced from lectures.md |
| pyproject.toml | Adds project metadata, Python version constraints, and dependencies for UV |
| missions.md | Removes old mission placeholder list |
| mission/mission_rubric.md | Adds a detailed grading rubric for missions |
| mission/mission_proposal_template.ipynb | Adds a proposal/final submission notebook template |
| lectures/talks/talks.md | Removes the previous talks cloning instructions |
| lectures/intros/008_3D_yield_curve.ipynb | Fixes data path for the yield curve notebook |
| lectures/decision_tree.dot | Adds DOT export of a decision tree |
| lectures/02_CausalInference.ipynb | Quiet pip installs + adds explicit dowhy install; updates outputs/metadata |
| lectures/01_Intro.ipynb | Notebook metadata reshuffle (outputs/execution_count) |
| lectures/.ipynb_checkpoints/02_CausalInference-checkpoint.ipynb | Adds a checkpoint notebook (should likely be excluded) |
| lectures.md | Fixes links and adds mission section |
| gpt_connector.py | Removes Perplexity API connector script |
| exercises/03a_supervised_learning_students version.ipynb | Removes failing Boston dataset import cell; minor output normalization |
| exercises/02_CausalInference_solution.ipynb | Updates dataset URLs to new /data location |
| README.md | Switches dependency instructions from Poetry to UV |
| .python-version | Pins Python version to 3.14 |
| .gitignore | Adds a large Python/Jupyter ignore set (with duplication) |
| .github/copilot-instructions.md | Adds repo guidance for Copilot usage and conventions |
Files not reviewed (7)
  • .idea/.gitignore: Language not supported
  • .idea/DAI.iml: Language not supported
  • .idea/copilot.data.migration.ask2agent.xml: Language not supported
  • .idea/inspectionProfiles/profiles_settings.xml: Language not supported
  • .idea/misc.xml: Language not supported
  • .idea/modules.xml: Language not supported
  • .idea/vcs.xml: Language not supported
Comments suppressed due to low confidence (1)

talks.md:1

  • This file is now a user-facing destination from lectures.md but only contains tbd. Either add at least the minimal talk listing (or restore the previous instructions from lectures/talks/talks.md), or avoid linking to it until it's populated.

Comment on lines +1 to +3
{
"cells": [
{

Copilot AI Mar 23, 2026

A Jupyter .ipynb_checkpoints file was added to the repo. These should typically not be committed (they're editor artifacts) and are already covered by the .gitignore patterns added in this PR. Please remove this file from version control and keep checkpoints ignored.

Copilot uses AI. Check for mistakes.
Comment thread lectures.md
[Introduction to data and AI in economics](https://colab.research.google.com/github/firrm/DAI/blob/main/lectures/00_Intro.ipynb)

[Paradox](https://colab.research.google.com/github/firrm/DAI/blob/main/lectures/01a_Paradox.ipynb)
[Paradox](https://colab.research.google.com/github/firrm/DAI/blob/main/lectures/01_Paradoxon.ipynb)

Copilot AI Mar 23, 2026

The lecture index now links to talks.md and to new Colab notebook paths (00_Intro.ipynb, 01_Paradoxon.ipynb). In this PR, talks.md is only a placeholder and the repo also contains lectures/01_Intro.ipynb, so these links are likely to be dead or confusing. Please either (1) populate talks.md with real content and ensure the Colab paths match actual notebook filenames, or (2) keep the previous working links.

Suggested change
[Paradox](https://colab.research.google.com/github/firrm/DAI/blob/main/lectures/01_Paradoxon.ipynb)
[Paradox](https://colab.research.google.com/github/firrm/DAI/blob/main/lectures/01_Intro.ipynb)

Comment thread README.md
Comment on lines +46 to 47
To install UV, follow the instructions of your local OS

Copilot AI Mar 23, 2026

The UV installation instructions are too vague for students/contributors. Please link to the official UV install docs (or include short per-OS commands) so setup is reproducible.

Suggested change
To install UV, follow the instructions of your local OS
To install UV, follow the official installation instructions in the UV docs: https://docs.astral.sh/uv/getting-started/installation/
For convenience, here are example commands for common operating systems (check the docs above for the latest instructions):
- **macOS / Linux (shell):**
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```
- **Windows (PowerShell):**
```powershell
iwr https://astral.sh/uv/install.ps1 -UseBasicParsing | iex
```

After installation, you should be able to run `uv --version` in your terminal.

pnposch and others added 14 commits March 23, 2026 11:44
Addresses Dependabot security alerts in both uv.lock and requirements.txt:

- tornado 6.5.4 → 6.5.5  (CVE-2026-31958 DoS, HIGH)
- keras 3.13.0 → 3.13.2   (CVE-2026-1669 local file disclosure, CVE-2026-0897 resource exhaustion, HIGH)
- pillow 12.1.0 → 12.1.1  (CVE-2026-25990 OOB write, HIGH)
- cryptography 46.0.3 → 46.0.5  (CVE-2026-26007 subgroup attack, HIGH)
- nbconvert 7.16.6 → 7.17.0  (CVE-2025-53000 uncontrolled search path, HIGH)
- virtualenv 20.36.0 → 21.2.0  (CVE-2026-22702 TOCTOU, MEDIUM)
- filelock 3.20.2 → 3.25.2  (CVE-2026-22701 symlink TOCTOU, MEDIUM)
- numpy 2.4.0 → 2.4.3  (yanked release, backward compatibility bug)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
lectures/05a_deep_learning.ipynb:
- Cell 25: replace deprecated 'from tensorflow.keras import layers, Model'
  with 'from keras import layers, Model' (Keras 3.x standalone API)

lectures/05b_deep_learning.ipynb:
- Cell 1: remove 'import tensorflow as tf' (TF not in requirements)
- Cell 3 (sampling func): replace tf.random.normal/tf.shape/tf.keras.backend.exp
  with keras.random.normal/keras.ops.shape/keras.ops.exp
- Cells 5 & 7 (CustomVariationalLayer): replace all tf.keras.backend.*
  with keras.ops.* equivalents:
    tf.keras.backend.flatten(x) -> keras.ops.reshape(x, (-1,))
    tf.keras.backend.mean(...)  -> keras.ops.mean(...)
    tf.keras.backend.square(x) -> keras.ops.square(x)
    tf.keras.backend.exp(x)    -> keras.ops.exp(x)

All changes target Keras 3.x (keras==3.13.2) which removed the legacy
keras.backend functional API in favor of keras.ops (numpy-style ops).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pin vulnerable transitive dependencies in pyproject.toml to close
all 5 open Dependabot alerts (alerts #110–#119):

- cryptography >= 46.0.7  (buffer overflow, medium; DNS constraint, low)
- Pygments >= 2.20.0       (ReDoS, low)
- requests >= 2.33.0       (insecure temp file reuse, medium)
- poetry >= 2.3.3          (wheel path traversal, high)

uv.lock and requirements.txt regenerated accordingly.

Also adds changes.md summarising all security and notebook fixes
made in this maintenance pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ashes

- .python-version: 3.14 → 3.12.10 (dowhy 0.14 requires <3.14; tables has
  pre-built wheels for 3.12)
- pyproject.toml: requires-python >=3.12,<3.14; add tensorflow>=2.16
  so Keras 3 has a backend both locally and on Colab
- requirements.txt: regenerated with --no-hashes so %pip install -r URL
  works on Colab (hash-pinned file produced wrong-platform hashes)
- uv.lock: re-resolved for Python 3.12

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- 00_Intro: fix broken PSU ANSUR II URL -> load from data/ with GitHub fallback
- 02_CausalInference: monkey-patch dowhy for NumPy 2.x compatibility
  (convert_to_binary returned 1-element arrays, np.vectorize needs scalars)
- intros/004_Intro_to_numpy: replace constants that were removed in NumPy 2.0
  (np.NAN, np.NINF, np.NZERO, np.PZERO)
- 05a/05b: set KERAS_BACKEND=tensorflow before keras import (Colab compat)
- 05b: wrap np.prod(shape) in int() for Keras Dense layer (numpy.int64 issue)
- 05b: reduce VAE epochs 10->2 for CPU execution; add note to increase for GPU
- lectures.md: fix 7 intro links (case mismatch), add 008_3D_yield_curve,
  fix exercise link for section 2, add exercise links for sections 3/4/5
- data/: commit ANSUR II male (4082 rows) and female (1986 rows) datasets
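
The "load from data/ with GitHub fallback" fix for 00_Intro could be sketched roughly as below. This is only an illustration: the helper name `resolve_data_source`, its parameters, and the example URL are hypothetical and not taken from the notebook. The returned string would then be handed to a reader such as `pandas.read_csv`.

```python
from pathlib import Path


def resolve_data_source(filename: str, local_dir: str, fallback_base_url: str) -> str:
    """Return a local path if the file exists, otherwise a remote URL.

    Hypothetical helper: all names and arguments are illustrative only.
    """
    local_path = Path(local_dir) / filename
    if local_path.is_file():
        # Prefer the copy committed under data/ so the notebook works offline.
        return str(local_path)
    # Otherwise fall back to a hosted copy, e.g. a raw GitHub URL.
    return f"{fallback_base_url.rstrip('/')}/{filename}"
```

Preferring the local copy keeps a cloned repository usable offline, while the fallback covers environments such as Colab, where only the notebook itself has been opened.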

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add openpyxl dependency (pandas xlsx support for exercise 02)
- Fix exercises/03b: add missing pandas import
- Fix exercises/05a: KERAS_BACKEND env, correct keras imports
- Fill in exercises/05b student template (GAN + VAE for energy prices)
  - Fix hardcoded os.chdir path (dynamic local/Colab detection)
  - Fix tensorflow.keras -> keras imports (Keras 3)
  - Fix keras.backend.exp -> keras.ops.exp (Keras 3)
  - Reduce iterations/epochs for CPU execution with comments for GPU
- Execute all remaining exercise notebooks (03c, 04, 05a, 05b) - all pass
- Execute lectures/05b_deep_learning.ipynb end-to-end - passes on CPU
- Update changes.md with exercise fixes and GPU/CUDA sysadmin info
- Add *.h5 / decision_tree.dot to .gitignore
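
A minimal sketch of the "KERAS_BACKEND env" fix mentioned above, assuming the variable only has an effect if it is set before `keras` is first imported. The function name `configure_keras_backend` is hypothetical; the notebooks may simply assign the variable inline.

```python
import os
import sys


def configure_keras_backend(backend: str = "tensorflow") -> bool:
    """Set KERAS_BACKEND and report whether the setting can still take effect."""
    # Keras 3 reads this variable when `keras` is first imported.
    os.environ["KERAS_BACKEND"] = backend
    # If keras is already loaded, the backend was chosen earlier, and
    # changing the variable now does not affect the running session.
    return "keras" not in sys.modules


effective = configure_keras_backend("tensorflow")
print("KERAS_BACKEND =", os.environ["KERAS_BACKEND"], "| takes effect:", effective)
# In a notebook, `import keras` would follow in the same cell or a later one.
```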

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ents.txt

uv export produces exact pins (e.g. numpy==2.4.3). On Colab, this
downgrades the pre-installed numpy (2.4.4) which breaks scipy and
other packages compiled against the newer version:
  ImportError: cannot import name '_center' from numpy._core.umath

Switch requirements.txt to >=minimum constraints so pip skips
already-satisfied packages. Local reproducible installs continue
to use uv sync with uv.lock.
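
For illustration, turning exact pins into minimum constraints could look like the sketch below. The commit does not say how requirements.txt was actually regenerated, so `relax_pins` is a hypothetical helper that handles only simple `name==version` lines and passes everything else through unchanged.

```python
import re

# Matches a simple exact pin such as "numpy==2.4.3", optionally with extras
# ("pkg[extra]==1.0") and an environment marker after ";".
_PIN = re.compile(
    r"^(?P<name>[A-Za-z0-9._-]+(?:\[[^\]]+\])?)==(?P<version>[^\s;]+)(?P<rest>.*)$"
)


def relax_pins(requirements_text: str) -> str:
    """Rewrite 'pkg==X' lines as 'pkg>=X'; leave comments and other lines alone."""
    out = []
    for line in requirements_text.splitlines():
        match = _PIN.match(line.strip())
        if match:
            out.append(f"{match['name']}>={match['version']}{match['rest']}")
        else:
            out.append(line)
    return "\n".join(out)


print(relax_pins("numpy==2.4.3"))  # prints "numpy>=2.4.3"
```

With `>=` constraints, pip treats a newer pre-installed version as already satisfying the requirement instead of downgrading it.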

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add requirements_colab.txt that only installs packages missing from
  Colab's default environment (mglearn, dowhy, empiricaldist, stargazer,
  wrds, newsapi-python, plotnine, tables).  Deliberately excludes numpy,
  scipy, requests, and cryptography to avoid:
    • numpy ABI breakage / 'cannot import _center from numpy._core.umath'
      (pre-compiled extensions crash after a numpy upgrade without restart)
    • 'you must restart the runtime' warning after numpy upgrade
    • google-colab/pyopenssl/pydrive2 conflicts from requests/cryptography
      version bumps

- Switch all 16 notebooks (9 lectures + 7 exercises) to install from
  requirements_colab.txt instead of requirements.txt

- 03c_supervised_learning.ipynb + exercise student version:
  Add '!apt-get install -y graphviz -qq' immediately after pip install.
  Colab does not ship the graphviz system binary, so graphviz.Source()
  raised ExecutableNotFound: failed to execute ['dot', ...]

- 05b_deep_learning.ipynb:
  Replace deprecated 'from keras.preprocessing import image' and
  'image.array_to_img(...)' with 'from keras.utils import array_to_img'
  (keras.preprocessing.image was removed in Keras 3.x)

- Also normalise 05b source arrays from single long strings to
  standard line-per-element format (no semantic change)
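
The graphviz failure described above comes from a missing system binary rather than a missing Python package. A small pre-flight check along the following lines could make that explicit before `graphviz.Source()` is called; the function name `graphviz_dot_available` is hypothetical and not part of the notebooks.

```python
import shutil


def graphviz_dot_available() -> bool:
    """Return True if the Graphviz `dot` executable can be found on PATH."""
    return shutil.which("dot") is not None


if not graphviz_dot_available():
    # On Colab the commit installs it with: !apt-get install -y graphviz -qq
    print("Graphviz 'dot' executable not found; install the system package first.")
```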

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace static 'iframe' renderer with environment-aware detection:
- Uses 'colab' renderer in Google Colab
- Falls back to 'notebook' renderer in standard Jupyter
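
One way such environment-aware renderer selection might be written is sketched below. It assumes that Colab kernels have `google.colab` loaded in `sys.modules`, which is a common detection idiom rather than something confirmed by this commit, and the function name `pick_plotly_renderer` is hypothetical.

```python
import importlib.util
import sys


def pick_plotly_renderer() -> str:
    """Choose a Plotly renderer name based on the running environment."""
    # Assumption: Colab kernels have the google.colab module preloaded.
    return "colab" if "google.colab" in sys.modules else "notebook"


renderer = pick_plotly_renderer()

# Apply the choice only when Plotly is installed, so the sketch runs anywhere.
if importlib.util.find_spec("plotly") is not None:
    import plotly.io as pio

    pio.renderers.default = renderer
```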

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Dockerfile: python:3.12-slim base with all course deps except TensorFlow/Keras (~1 GB image)
- .dockerignore: excludes .git, checkpoints, cache dirs, uv.lock
- docker-compose.yml: one-command start (docker compose up → JupyterLab on :8888)
  with ./my-work volume for persistent student notebooks
- .github/workflows/docker-publish.yml: builds and pushes ghcr.io/firrm/dai on every push to main
- README.md: add Quick Start with Docker section

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
FITS GZIP decompression bomb vulnerability (GHSA-whj4-6x5x-4v2j, CVSS 8.7).
Pillow did not limit GZIP-compressed data when decoding FITS images, allowing
denial-of-service via unbounded memory consumption. Fixed in Pillow 12.2.0.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Plotly's Colab renderer causes 'refused to connect' errors.
Switch to seaborn (already in requirements) for static, zero-config rendering:

- 5x px.histogram → sns.histplot(stat='probability')
- 1x px.histogram with marginal='box' → sns.histplot + sns.boxplot on subplots
- 3x ff.create_distplot (KDE-only) → sns.kdeplot per group
- Remove plotly.express, plotly.figure_factory, plotly.io imports
- Remove dead plotly.io import from 01_Paradoxon.ipynb

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Run 00_Intro.ipynb and 01_Paradoxon.ipynb so plots are
visible in GitHub's notebook renderer.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
seaborn's hue= internally calls pandas reindex, which raises
ValueError when the DataFrame index has duplicate labels.
Fix: pass brfss/nsfg with .reset_index(drop=True) in cells 34 and 36.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@pnposch pnposch closed this Apr 16, 2026

3 participants