{"id":863,"date":"2026-07-02T10:30:05","date_gmt":"2026-07-02T09:30:05","guid":{"rendered":"https:\/\/guillaumesblog.net\/?p=863"},"modified":"2026-07-02T10:42:07","modified_gmt":"2026-07-02T09:42:07","slug":"build-my-ai-bot-to-serve-notes-from-my-voice-memos","status":"publish","type":"post","link":"https:\/\/guillaumesblog.net\/index.php\/build-my-ai-bot-to-serve-notes-from-my-voice-memos\/","title":{"rendered":"Build my AI bot to serve notes from my voice memos"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">Let&#8217;s get the bot up on its feet and make use of the backend of our choice, in this instance an Oracle OCI Generative AI project. The bot needs to be able to converse with the user (myself) via a chat app; it also needs a memory so it can give me clues about my past memos. In addition, the bot&#8217;s knowledge is updated in real time as the user uploads voice notes day after day. The goal is simple: to automate note-taking from voice memos.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">I ask the bot &#8220;What are the highlights for the 30th of June?&#8221; and it gives me key themes from my voice notes for that day. Equally, I ask it &#8220;Have I mentioned databases someday?&#8221; and it answers with key excerpts from specific dates. You can extrapolate and ask about pretty much anything on any date, and the bot helps you remember.
See below&#8230; (These are random voice notes.)<\/p>\n\n\n\n<figure class=\"wp-block-image size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"473\" height=\"1024\" src=\"https:\/\/guillaumesblog.net\/wp-content\/uploads\/2026\/07\/Screenshot_Telegram-473x1024.jpg\" alt=\"\" class=\"wp-image-865\" style=\"aspect-ratio:0.4619187672542899;width:302px;height:auto\" srcset=\"https:\/\/guillaumesblog.net\/wp-content\/uploads\/2026\/07\/Screenshot_Telegram-473x1024.jpg 473w, https:\/\/guillaumesblog.net\/wp-content\/uploads\/2026\/07\/Screenshot_Telegram-138x300.jpg 138w, https:\/\/guillaumesblog.net\/wp-content\/uploads\/2026\/07\/Screenshot_Telegram-768x1664.jpg 768w, https:\/\/guillaumesblog.net\/wp-content\/uploads\/2026\/07\/Screenshot_Telegram-709x1536.jpg 709w, https:\/\/guillaumesblog.net\/wp-content\/uploads\/2026\/07\/Screenshot_Telegram-945x2048.jpg 945w, https:\/\/guillaumesblog.net\/wp-content\/uploads\/2026\/07\/Screenshot_Telegram.jpg 1080w\" sizes=\"auto, (max-width: 473px) 85vw, 473px\" \/><figcaption class=\"wp-element-caption\">Image 1 &#8211; Bot retrieving information from my unstructured voice memos<\/figcaption><\/figure>\n\n\n\n<!--more-->\n\n\n\n<p class=\"wp-block-paragraph\">Since we are playing around with the services here, all I offer is an exploratory concept, and I leave it up to you to make it enterprise-ready if you wish to do so.
Now let&#8217;s have a look at the architecture proposed to make this bot work.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"724\" src=\"https:\/\/guillaumesblog.net\/wp-content\/uploads\/2026\/07\/image-1-1024x724.png\" alt=\"\" class=\"wp-image-873\" srcset=\"https:\/\/guillaumesblog.net\/wp-content\/uploads\/2026\/07\/image-1-1024x724.png 1024w, https:\/\/guillaumesblog.net\/wp-content\/uploads\/2026\/07\/image-1-300x212.png 300w, https:\/\/guillaumesblog.net\/wp-content\/uploads\/2026\/07\/image-1-768x543.png 768w, https:\/\/guillaumesblog.net\/wp-content\/uploads\/2026\/07\/image-1.png 1048w\" sizes=\"auto, (max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 1362px) 62vw, 840px\" \/><figcaption class=\"wp-element-caption\">Image 2 &#8211; Architecture<\/figcaption><\/figure>\n\n\n\n<ol class=\"wp-block-list\">\n<li>The user uploads their voice recordings into the <em>Voice Memos<\/em> bucket.<\/li>\n\n\n\n<li>The Events service picks up an upload event in the <em>Voice Memos<\/em> bucket and triggers a function that adds all newly uploaded recording files to a list. My code is <a href=\"https:\/\/github.com\/geddegda\/mynotes_aibot\/blob\/main\/func1.py\">here<\/a>.<\/li>\n\n\n\n<li>The function ships the list to OCI Speech for transcription; the transcriptions are text files (.json files) and end up in the <em>Transcriptions<\/em> bucket.<\/li>\n\n\n\n<li>The Events service picks up an upload event in the <em>Transcriptions<\/em> bucket and triggers another function.<\/li>\n\n\n\n<li>This function launches a Data Sync job from the Data Sync Connector. The transcriptions are then vectorised so the LLM can &#8220;digest&#8221; the new memos. In other words, they are added to its &#8220;knowledge&#8221;.
The code snippet is <a href=\"https:\/\/github.com\/geddegda\/mynotes_aibot\/blob\/main\/func2.py\">here<\/a>.<\/li>\n<\/ol>\n\n\n\n<p class=\"wp-block-paragraph\">I use functions since they are easy to trigger in response to an event happening on the platform. On the chat app side, the objective is to reach the Generative AI project and leverage its &#8220;intelligence&#8221;, so we need to set up the front end. You can use a pre-packaged front end like <a href=\"https:\/\/docs.openwebui.com\/\">Open WebUI<\/a> or a chat app; I chose Telegram for its ease of use.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>A &#8211; The user reaches the bot channel. (You create the bot channel beforehand: follow the procedure <a href=\"https:\/\/core.telegram.org\/bots\/features\">here<\/a> and use this <a href=\"https:\/\/t.me\/botfather\">link<\/a>. WhatsApp is also an option, but it requires a Facebook account, which is a no-go for me.)<\/li>\n\n\n\n<li>B &#8211; You need the bot backend. I use a VM instance that continuously polls the bot&#8217;s channel; the instructions are available <a href=\"https:\/\/docs.python-telegram-bot.org\/en\/stable\/\">here<\/a> and the code is <a href=\"https:\/\/github.com\/geddegda\/mynotes_aibot\/tree\/main\/vm\">here<\/a>.<\/li>\n\n\n\n<li>C &#8211; I feed <em>Guillaume&#8217;s<\/em> Telegram user inputs to the Generative AI project endpoint and post the answer back into the channel, where the user expects it (please see Image 1 above).
The agent is able to call a list of tools; in this instance there is a single tool, which queries the vector store to retrieve the latest vectorised voice memo transcripts. Check <a href=\"https:\/\/github.com\/geddegda\/mynotes_aibot\/blob\/main\/vm\/generative.py\">here<\/a>.<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code># `client` is assumed to be created elsewhere (see generative.py linked above).\ndef oci_gen_chat(entry):\n    # Send the user's message to the model, exposing the vector store as a file_search tool.\n    response = client.responses.create(\n        model=\"xai.grok-4-1-fast-reasoning\",\n        input=entry,\n        <strong>tools=&#91;\n            {\n                \"type\": \"file_search\",\n                \"vector_store_ids\": &#91;\"vs_phx_abc\"]\n            }\n        ]<\/strong>\n    )\n    return response<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">This setup uses a Generative AI project, <em>&#8220;Build Agents with the OCI Responses API&#8221;<\/em>, which is part of Enterprise AI Agents in OCI Generative AI and is a managed service. Documentation is available <a href=\"https:\/\/docs.oracle.com\/en-us\/iaas\/Content\/generative-ai\/agents.htm\">here<\/a> and <a href=\"https:\/\/docs.oracle.com\/en-us\/iaas\/Content\/generative-ai\/projects.htm\">here<\/a>. You can also launch your own deployment and resources with <em>&#8220;Hosted Agentic Applications&#8221;<\/em>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Looking at the industry as a whole, data sovereignty and AI sovereignty are such prominent topics that having the flexibility to choose deployment models, software building blocks, data region location and memory retention, all from the cloud platform, is handy. If data is a concern, an obvious improvement is to drop Telegram (WhatsApp is no better); I would recommend a front end under your control, e.g. a VM running Open WebUI.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">A few words and gotchas learned along the way:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>As usual, all calls from and to services on OCI need authorisation via IAM.
This includes Dynamic Groups and Policies.<\/li>\n\n\n\n<li>It is mandatory to be familiar with the OCI <a href=\"https:\/\/docs.oracle.com\/en-us\/iaas\/Content\/API\/Concepts\/sdk_authentication_methods.htm\">SDK Authentication Methods<\/a> so your code can make callouts to other services.<\/li>\n\n\n\n<li>Capture exceptions in functions with try-except blocks, plus logging and print(&#8220;your error&#8221;, flush=True), so you can actually debug. I had many 403 and 404 errors related to lack of authorisation and wrong endpoints.<\/li>\n\n\n\n<li>Instead of repeatedly editing your functions and redeploying them with fn deploy, deploy to a local Fn server; please see <a href=\"https:\/\/blogs.oracle.com\/cloud-infrastructure\/accelerate-functions-development-by-using-fn-server\">here<\/a>.<\/li>\n\n\n\n<li>I use Cloud Shell, and its local storage and podman cache fill up quickly; you have to run <em>podman rmi $(podman images -q)<\/em> from time to time, see <a href=\"https:\/\/docs.oracle.com\/en-us\/iaas\/Content\/Functions\/Tasks\/functionscloudshellcleanup.htm\">here<\/a>.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">You have your bot now. Hope this helps and see you soon!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Let&#8217;s get the bot up on its feet and make use of the backend of our choice, in this instance an Oracle OCI Generative AI project.
The bot needs to be able to converse with the user (myself) via a chat app; it also needs to have a memory and be able to give me clues &hellip; <a href=\"https:\/\/guillaumesblog.net\/index.php\/build-my-ai-bot-to-serve-notes-from-my-voice-memos\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Build my AI bot to serve notes from my voice memos&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":"[]"},"categories":[1],"tags":[],"class_list":["post-863","post","type-post","status-publish","format-standard","hentry","category-conversation"],"_links":{"self":[{"href":"https:\/\/guillaumesblog.net\/index.php\/wp-json\/wp\/v2\/posts\/863","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/guillaumesblog.net\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/guillaumesblog.net\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/guillaumesblog.net\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/guillaumesblog.net\/index.php\/wp-json\/wp\/v2\/comments?post=863"}],"version-history":[{"count":22,"href":"https:\/\/guillaumesblog.net\/index.php\/wp-json\/wp\/v2\/posts\/863\/revisions"}],"predecessor-version":[{"id":888,"href":"https:\/\/guillaumesblog.net\/index.php\/wp-json\/wp\/v2\/posts\/863\/revisions\/888"}],"wp:attachment":[{"href":"https:\/\/guillaumesblog.net\/index.php\/wp-json\/wp\/v2\/media?parent=863"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/guillaumesblog.net\/index.php\/wp-json\/wp\/v2\/categories?post=863"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/guillaumesblog.net\/index.php\/wp-json\/wp\/v2\/tags?post=863"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}