- [Prerequisites](#prerequisites)
  - [Environment variables](#environment-variables)
  - [Configuration](#configuration)
    - [General Settings](#general-settings)
    - [Overlay Icon Configuration](#overlay-icon-configuration)
    - [File Configuration](#file-configuration)
    - [API Request Settings](#api-request-settings)
    - [Transcription Settings](#transcription-settings)
    - [YouTube Settings](#youtube-settings)
    - [AI Model Settings](#ai-model-settings)
    - [Image Generation Settings](#image-generation-settings)
    - [Art Downloader Settings](#art-downloader-settings)
    - [Content Documentation Settings](#content-documentation-settings)
  - [Vale](#vale)
- [Add the daily Youtube video](#add-the-daily-youtube-video)
- [Add additional Youtube video to a day](#add-additional-youtube-video-to-a-day)
- [Add today's sharing](#add-todays-sharing)
- [Merge monthly markdown files into one large README](#merge-monthly-markdown-files-into-one-large-readme)
- [Generate table of contents for markdown files](#generate-table-of-contents-for-markdown-files)
- [Using the commands utility that accepts text commands in file `commands.txt`](#using-the-commands-utility-that-accepts-text-commands-in-file-commandstxt)

# HOWTO

## Prerequisites

- [curl](https://curl.se/)
- [gm](http://www.graphicsmagick.org/)
- [m4](https://www.gnu.org/software/m4/m4.html)
- [stitchmd](https://github.com/abhinav/stitchmd)
- [doctoc](https://github.com/ktechhub/doctoc)
- [mdformat](https://github.com/hukkin/mdformat)
- [markdownlint](https://github.com/DavidAnson/markdownlint)
- [vale](https://github.com/errata-ai/vale)
- [yt-dlp](https://github.com/yt-dlp/yt-dlp)
- [exiftool](https://exiftool.org/)

1. Create an empty `videos.txt` file under the root directory.
1. Create a directory for each month of the year under the root directory.
1. Add a `header.md` file under each monthly directory with the following content. Example for January:

   ```markdown
   <!-- toc -->
   <!-- tocstop -->

   # January 2025

   RIAY January 2025
   ```

`markdown-toc-gen` won't generate the table of contents for the monthly markdown (in this case, `January.md`) without the mandatory `<!-- toc -->` and `<!-- tocstop -->` comments in the header.

You can replace the top-level markdown header

```markdown
# January 2025

RIAY January 2025
```

with your own if you wish.

Add a `compact.txt` file with the first line as `header.md` under each monthly directory. This ensures the presence of the header for each month's markdown.

Note: You can do all this by simply executing the script `setup`.

### Environment variables

1. Export an environment variable `GITHUB_USERNAME` by adding the following line to your `.bash_profile` file.
   1. Put your GitHub user id between the double quotes.

   ```bash
   export GITHUB_USERNAME=""
   ```

1. Export environment variable `YOUTUBE_API_KEY` by adding the following line to `.bash_profile`.

   ```bash
   export YOUTUBE_API_KEY=
   ```

   Substitute your Google API key, which must be able to access the YouTube Data API. Set up your API key using instructions at

1. Export environment variable `DEEPSEEK_API_KEY` by adding the following line to `.bash_profile`.

   ```bash
   export DEEPSEEK_API_KEY=
   ```

   Substitute your DeepSeek API key. Set up your API key using instructions at

1. Export environment variable `DEEPINFRA_TOKEN` by adding the following line to `.bash_profile`.

   ```bash
   export DEEPINFRA_TOKEN=
   ```

**Note:** You will need to set up either DeepSeek or Gemini API keys for AI-generation of podcast summaries.

### Configuration

The configuration file `config.env` is located in the repo's root directory.
It contains the following settings:

#### General Settings

```bash
# Project config values
PROJECT="Rosary In A Year (RIAY)"
# Turn on or turn off logging
LOGGING=false
# The year in which the podcasts are being followed
YEAR=2025
# Github config values
REPO_OWNER=linusjf
REPO_NAME=RIAY
```

#### Overlay Icon Configuration

```bash
ICON_FILE="play-button.png"
ICON_SIZE="256x256"
ICON_OFFSET="+32+0"
ICON_COMMENT="Play Icon Added"
```

#### File Configuration

```bash
COMPACT_FILE="compact.txt"
VIDEOS_FILE="videos.txt"
```

#### API Request Settings

```bash
# Maximum number of retries for a REST API call
CURL_MAX_RETRIES=5
# Initial retry delay (seconds) - increases exponentially
CURL_INITIAL_RETRY_DELAY=2
# Connection timeout for curl requests (seconds)
CURL_CONNECT_TIMEOUT=30
# Maximum time for curl operations (seconds)
CURL_MAX_TIME=90
```

#### Transcription Settings

```bash
# Whether to transcribe videos
TRANSCRIBE_VIDEOS=false
# Whether to transcribe locally
TRANSCRIBE_LOCALLY=false
# Whether to use faster-whisper python library
USE_FASTER_WHISPER=true
# Whether to enable failover mode
ENABLE_FAILOVER_MODE=true
# ASR API KEY
ASR_LLM_API_KEY="$DEEPINFRA_TOKEN"
# ASR LLM base url
ASR_LLM_BASE_URL="https://api.deepinfra.com/v1"
# ASR LLM endpoint
ASR_LLM_ENDPOINT="/openai/audio/transcriptions"
# ASR LLM model
ASR_LLM_MODEL="openai/whisper-large-v3"
# ASR local model
ASR_LOCAL_MODEL="small"
# Initial prompt for whisper
ASR_INITIAL_PROMPT="In Ascension Press' Rosary in a Year podcast, Fr. Mark-Mary Ames meditates with sacred art, saint writings, and scripture. Poco a poco."
# Whether to carry initial prompt
ASR_CARRY_INITIAL_PROMPT=true
# Beam size for ASR
ASR_BEAM_SIZE=5
```

#### YouTube Settings

```bash
# Number of retries for yt-dlp
YT_DLP_RETRIES=20
# Timeout for yt-dlp (seconds)
YT_DLP_SOCKET_TIMEOUT=30
# Captions output directory
CAPTIONS_OUTPUT_DIR="captions"
```

#### AI Model Settings

```bash
# Temperature for LLM responses (0 for deterministic output)
TEMPERATURE=0.5
# text llm api key
TEXT_LLM_API_KEY="$DEEPSEEK_API_KEY"
# text llm base url
TEXT_LLM_BASE_URL="https://api.deepseek.com"
# text llm chat model endpoint
TEXT_LLM_CHAT_ENDPOINT="/chat/completions"
# text llm model used for summarization
TEXT_LLM_MODEL="deepseek-chat"
# The system prompt for summarizing text.
SYSTEM_SUMMARY_PROMPT="You are a helpful assistant that summarizes content. Be concise, helpful."
# The prompt for summarizing chunks within the podcast transcript.
CHUNK_SUMMARY_PROMPT="Summarize this text, excluding plugs, branding, and promotions. Avoid mention of Day, podcast and Rosary in a Day. Additionally, exclude repetitive prayers such as Our Father, Hail Mary and Glory Be. Retain details of artwork decribed including title, artist (full name), current location of artwork, medium, style, date (year, decade and century) and brief description."
# The prompt for the final summary of all the chunk summaries
FINAL_SUMMARY_PROMPT="Summarize the following text in the voice and style of C.S. Lewis, employing his clarity, moral insight, and rhetorical flair, as if writing directly to an intelligent but unassuming reader. Generate a title as well. Avoid modern jargon, maintain a tone of gentle conviction, and present the ideas as timeless truths. Do not include any meta-commentary, footnotes, or explanations. Retain details of artworks. Start with a level three markdown header '### AI-Generated Summary: '. Append your generated title (no colons, proper grammar) to the header."
# The meta-prompt to generate image prompt and caption
SUMMARY_IMAGE_META_PROMPT='You are an assistant that extracts visual meaning from text. Given any descriptive input, generate: 1. "image_prompt" - a detailed, visual description suitable for generating an image. 2. "caption" - a brief, expressive caption that summarizes or enhances the image concept. No colons. Return your response as a JSON object in the format: {"caption": "This is the generated caption", "image_prompt": "This is the image prompt"}'
CAPTION_PROMPT='You are an assistant who extracts visual meaning from text related to Christian art. Given any descriptive input, generate: 1. "caption" - a brief, expressive caption that summarizes or enhances the text using any information you have concerning Christian art the text identifies. Return your response as a JSON object in the format: {"caption": "This is the generated caption"}'
SUMMARY_ARTWORK_DETAILS_PROMPT="From the following text, extract any described artwork and return a JSON object with: \"details\": A detailed summary including the artwork's title, artist, date, medium, style, current location of artwork (location) and subject. Fill values with empty string if not specified in the text. \"filename\": A lowercase alphanumeric string (no hyphens or underscores), 20 characters or fewer, representing a mostly unique filename derived from the title, artist, and date. \"caption\": A caption (in proper English), no more than 20 words, that uses title, artist, date,location, medium and subject (in that ordered priority). If no artwork is described in the text, return an empty JSON object {}."
```

#### Image Generation Settings

```bash
# Whether to auto-generate images
AUTO_GENERATE_IMAGES=false
# Path to image generation script
IMAGE_GENERATION_SCRIPT="openaigenerateimage"
# Deepinfra image generation model
DEEPINFRA_IMAGE_GENERATION_MODEL="stabilityai/sd3.5"
# Fal.ai image generation model
FALAI_IMAGE_GENERATION_MODEL="janus"
```

#### Art Downloader Settings

```bash
# Whether to auto-download art
AUTO_DOWNLOAD_ART=false
# Art downloader directory
ART_DOWNLOADER_DIR="artdownloads"
# Whether to verify art images
VERIFY_ART_IMAGES=true
# Art verifier script
ART_VERIFIER_SCRIPT="matchimagetometadata.py"
# Art metadata prompt
ART_METADATA_PROMPT="You are an expert on Christian art. Provided the following details about the artwork '{}' analyze the image and generate the following detailed metadata in json format (no markdown, no nesting of attributes, all top-level): \"title\": The title of the artwork, \"artist\": The artist or artists, \"medium\": oil on canvas, fresco, marble sculpture, etc., \"location\": where the artwork is currently located, \"date\": the creation year and century, \"style\": the artistic style or artistic school that influenced the art, \"description\": Description of the artwork (including visual elements, composition, subject matter, and style). \"image_color\": Whether the image is in Color, Grayscale, Monochrome, Duotone/Tritone, Sepia,Color-tinted grayscale, Black-and-white etc. \"watermarked\": Whether the image is watermarked or not. \"caption\": A caption (in proper English), no more than 20 words, that uses title, artist, date, location, medium and description (in that ordered priority). \"analyzed\": Whether the analysis was possible or not. \"comments:\": Your comments other than the fields above and if analysis was possible or not and why. Do not add any extraneous information that will mangle the json object expected."
# Image content validation strictness
IMAGE_CONTENT_VALIDATION="lenient"
# Vector embeddings model api key
VECTOR_EMBEDDINGS_MODEL_API_KEY="$DEEPINFRA_TOKEN"
# Vector embeddings provider base url
VECTOR_EMBEDDINGS_BASE_URL="https://api.deepinfra.com/v1/openai"
# Vector embeddings model
VECTOR_EMBEDDINGS_MODEL="thenlper/gte-large"
# Whether to look for alternate images
FIND_ALTERNATE_IMAGES=false
```

#### Content Documentation Settings

```bash
# List of files for generating documentation
CONTENT_DOCS=(
  "RIAY=start.md"
  "January=January.md"
  "February=February.md"
  "March=March.md"
  "April=April.md"
  "May=May.md"
  "June=June.md"
  "July=July.md"
  "August=August.md"
  "September=September.md"
  "October=October.md"
  "November=November.md"
  "December=December.md"
)
```

Modify the config values to suit your preferences.

### Vale

Initialize `vale` styles by executing the command `vale sync`. This should download the styles specified in `.vale.ini`.

## Add the daily Youtube video

Execute the script `addvideo` with the following parameters:

- video id - the id of the YouTube video
- caption or title (in double quotes)

Example:

```bash
./addvideo 5I2BbalTOPo "Hagar and Ishmael"
```

Results:

1. Computes the `day of year` from the length of the `videos.txt` file.

   `day of year = (number of lines in videos.txt) + 1`

   In this case, 10.
1. Appends the video id to the file `videos.txt` present in the root directory.
1. Generates markdown file `Day010.md` in the `January` subdirectory.
   1. This markdown file has a link to the YouTube video.
   1. It also has an AI-generated summary of the podcast.
1. Generates image file `Day010.jpg` in the `January/jpgs` directory.
1. Appends the `Day010.md` filename to the `January/compact.txt` file.
1. Updates the `January.md` file in the root directory with the contents of `Day010.md`.
1. Updated files:
   1. `./videos.txt`
   1. `./January.md`
   1. `./January/compact.txt`
1. Created files:
   1. `./January/Day010.md`
   1. `./January/jpgs/Day010.jpg`

## Add additional Youtube video to a day

To add an additional video to the markdown for a given day, execute the script `addvideotoday` with the following parameters:

- video id - the id of the YouTube video
- day of year - the day of the year for which the video is to be added

Example:

```bash
./addvideotoday 5I2BbalTOPo 21
```

Results:

1. Updates markdown file `Day021.md` in the `January` subdirectory.
   1. This markdown file has a link to the YouTube video.
   1. It also has an AI-generated summary of the podcast.
1. Generates image file `.jpg` in the `January/jpgs` directory.
1. Updates the `January.md` file in the root directory with the contents of `Day021.md`.
1. Updated files:
   1. `January/Day021.md`
   1. `./January.md`
1. Created files:
   1. `./January/jpgs/.jpg`

## Add today's sharing

1. First, add today's video.
1. Edit the generated `Dayxxx.md` file for today.
1. Paste the sharing text into the file, adding appropriate markdown headers as needed.
1. Save the file.
1. Execute script `genmonth` with the following parameters:
   - month index: 1 to 12
   - optional four-digit year: 20XX

Example:

```bash
./genmonth 01 2025
# or omit the year; the year value will be picked from the environment variable YEAR
./genmonth 01
```

Results:

Updates the `January.md` file with the sharing text added to the `Day010.md` file.

You can add sharing to other days in the same way. Don't forget to execute `genmonth` with the appropriate month index for that day.

You can get the month index by executing the following bash command:

```bash
date --date="$(date --date='jan 1 + 30 days' '+%B %d, %Y')" +%m
```

The preceding command gives the month index for day 31. For any other day, decrement the day of year by 1 and substitute the result for `30` in the command.

## Merge monthly markdown files into one large README

1. Edit the `stitch.md` file provided to include the markdown files you wish to merge.
1. The file format follows:

   ```markdown
   # README

   - [RIAY](start.md)
   - [January](January.md)
   - [February](February.md)
   - [March](March.md)
   - [April](April.md)
   - [May](May.md)
   - [June](June.md)
   - [July](July.md)
   - [August](August.md)
   - [September](September.md)
   - [October](October.md)
   - [November](November.md)
   - [December](December.md)
   ```

   Include or exclude files as needed.
1. Execute the `stitch` script.

   ```bash
   ./stitch
   ```

Results:

Generates the README from the contents of the markdown files listed in `stitch.md`.

## Generate table of contents for markdown files

Execute the `gentoc` script as follows:

```bash
./gentoc
```

Before executing the script, place the comments `<!-- toc -->` and `<!-- tocstop -->` in the markdown file. The table of contents is generated between these markers.

Results:

Generates the table of contents from the existing headings in the markdown file.

## Using the commands utility that accepts text commands in file `commands.txt`

1. Install ANTLR4

   Use [pyenv](https://github.com/pyenv/pyenv) to install and set up your Python3 environment.

   Example of setting up your Python and ANTLR4 environment:

   ```bash
   pyenv install 3.10
   pyenv global 3.10
   pip install antlr4-tools antlr4-python3-runtime python-dotenv
   ```

1. Add commands to the `commands.txt` file.

   Available commands:

   - addvideo
   - addvideotoday
   - addimgtoday
   - genmonth
   - lintall
   - stitch
   - gentoc

   **Note:** The `addvideotoday` and `addimgtoday` commands need the day of year to be a three-digit number. Hence, 1 becomes 001, 20 becomes 020 and 99 becomes 099.

   For simplicity and consistency, the commands wrap their command line equivalents.
1. Execute the `commands.py` script.

   ```bash
   ./commands.py
   ```

   This executes the commands in the order in which they appear in `commands.txt`. If any command fails, the program outputs an error message for that command and continues with the commands that follow.
1. Example `commands.txt` file

   ```text
   # example commands
   addvideo "abc123456" "Example video" # add example video
   addvideotoday "abc123456" 010 # add example video to day 10 markdown.
   genmonth 01 2025 # generate markdown for month January, 2025
   lintall # lint all the markdown files
   ```

   The program treats `#` as the start of a comment: it ignores everything from the `#` symbol to the end of the line.
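
The comment and day-of-year rules for `commands.txt` can be illustrated with a short Python sketch. This is not the code in `commands.py`: the ANTLR4 prerequisite suggests the real utility uses a generated parser, whereas this sketch swaps in the standard library's `shlex` for tokenizing. The helper names `strip_comment`, `pad_day` and `parse_commands` are hypothetical, and the sketch assumes that `#` never appears inside a quoted argument.

```python
import shlex


def strip_comment(line: str) -> str:
    """Drop everything from the first '#' to the end of the line."""
    # Assumption: '#' never occurs inside a quoted argument.
    return line.split("#", 1)[0].strip()


def pad_day(day: int) -> str:
    """Format a day of year as a three-digit, zero-padded string."""
    return f"{day:03d}"


def parse_commands(text: str) -> list[list[str]]:
    """Return one token list per line that still has content after comment removal."""
    commands = []
    for line in text.splitlines():
        stripped = strip_comment(line)
        if stripped:
            # shlex.split honours double quotes, so "Example video" stays one token.
            commands.append(shlex.split(stripped))
    return commands


example = """
# example commands
addvideo "abc123456" "Example video" # add example video
addvideotoday "abc123456" 010 # add example video to day 10 markdown.
lintall # lint all the markdown files
"""

for tokens in parse_commands(example):
    print(tokens)

print(pad_day(1), pad_day(20), pad_day(99))
```

Each printed list holds a command name followed by its arguments, which is the shape a dispatcher would need before calling the matching script.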