How I Built An AI App To Help Busy (Lazy) People Improve Their English Speaking Skills

4 Jul 2024

Hey hackers 👋 I’m the founder of Fluently, and I want to share my journey of creating a tool that helps non-native professionals improve their English speaking skills with AI. Think Grammarly, but for video conferences.

How the Idea Came About

I lived in the States for over a year, where I improved my English to an advanced level. I consume almost all my information, including books, in English.

But despite this, over the last 4 months of living in London while attending the Entrepreneur First accelerator, I noticed that people sometimes still asked me to repeat what I’d said. I could see from their faces that they hadn’t understood anything. That's hard to hide.

The funniest situation happened when a guy from Latvia quietly told me: "you pronounce 'vague' as 'wagyu'." The first word means unclear or indistinct, while the second is Japanese beef 😅

In short, I once again felt the desire to improve my English. The attitude of "oh well, they'll understand anyway, I'm not local" doesn't suit me at all. However, working with tutors is difficult. Firstly, you need to set aside time for lessons. Secondly, good tutors are expensive. Thirdly, one hour-long lesson doesn't produce much feedback to work with for a whole week.

The Solution: Fluently

My background as an ML engineer got me thinking: What if AI could track and highlight my common mistakes, and assist me in correcting them? Like a virtual tutor that fits organically into my daily routine and takes my English to a new level. That’s how Fluently started.

First of all, I decided to spend a week finding out whether anyone else was interested. I put together a landing page with a waitlist sign-up and simply outlined the product's main benefits on it. The site hasn’t changed since.

Then I showed the landing page to my friends with a "hey, look what I found" approach to see how they would react. Surprisingly, two-thirds of them showed interest in the app and signed up for the waitlist. The rest either already had good English or didn’t care.

I also shared a short demo video of Fluently on my Telegram channel, and a few friends reposted it. As a result, more than 200 people joined the waitlist. That was a big sign for me to start building!

What is Fluently now?

Fluently is a Mac app designed to help non-native professionals improve their English by giving instant feedback after online calls, like those on Google Meet or Zoom. Imagine having a personal coach who offers tips right after each call.

To try Fluently, download the app and follow these easy steps:

  1. Start the app: Fluently activates when you speak English during online meetings.
  2. Get feedback: After each call, Fluently provides feedback tailored to your language challenges, pointing out mistakes and suggesting improvements.
  3. Track progress: Over time, you can see where you’re improving and where you still need practice.

Here is an example of the Fluently feedback:

Some of Fluently's key benefits and features:

  • Real-life feedback: Helps fix mistakes from real calls, not just in classrooms.
  • Effortless learning: No extra time needed – just learn from your usual meetings.
  • English Level Assessment: Evaluates your English level and identifies your strongest and weakest areas.
  • AI Tutor: Allows for English practice even when you’re not on calls.

The Tech Side

Since the app is for macOS, I decided to write the client in Swift. We could have looked at Electron, but in my experience native apps feel better. And if we have to dive into something low-level, it will be faster to solve issues in Swift.

Currently, the app detects the start of a call and begins analyzing the user's audio in small chunks, which are processed on the server. Only the user's speech is analyzed; the app doesn't even hear the other participants (except in some cases of loud conversations without headphones).
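
To make the "small chunks" idea concrete, here is a minimal Python sketch of splitting a stream of audio samples into fixed-length pieces before sending them off. The real client is written in Swift, and the sample rate, chunk length, and the name `iter_chunks` are assumptions made up for this illustration, not Fluently's actual values.

```python
from typing import Iterator, Sequence

SAMPLE_RATE = 16_000   # assumed sample rate in Hz (hypothetical)
CHUNK_SECONDS = 0.5    # assumed chunk length in seconds (hypothetical)


def iter_chunks(
    samples: Sequence[int],
    sample_rate: int = SAMPLE_RATE,
    chunk_seconds: float = CHUNK_SECONDS,
) -> Iterator[Sequence[int]]:
    """Yield consecutive fixed-length slices of the audio.

    The last slice may be shorter than the others.
    """
    chunk_size = int(sample_rate * chunk_seconds)
    if chunk_size <= 0:
        raise ValueError("a chunk must contain at least one sample")
    for start in range(0, len(samples), chunk_size):
        yield samples[start:start + chunk_size]


# Example: 1.25 seconds of silent mono audio.
audio = [0] * 20_000
chunks = list(iter_chunks(audio))
```

With these assumed settings, each chunk would be sent to the server as soon as it is full, so analysis can start while the call is still in progress.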

The backend is written in Python, and the ML models run on PyTorch. The server receives the audio and detects pronunciation errors, which are sent back to the app.

I won’t go into the implementation details of the pipeline itself, as that’s a topic for a separate post. To simplify, it works as follows: the audio is transcribed into text, the text is converted into phonemes, and a separate model checks how well those phonemes match the sounds actually spoken in the recording.
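
The last two steps of that pipeline can be illustrated with a toy Python sketch. This is not the actual implementation: the lexicon below is a tiny hand-written stand-in for a real pronunciation dictionary (with ARPAbet-style symbols), the "heard" phonemes are passed in by hand instead of coming from an acoustic model, and the function names are hypothetical. The comparison uses the standard library's `difflib.SequenceMatcher` as a simple substitute for the matching model.

```python
from difflib import SequenceMatcher

# Toy pronunciation lexicon (assumption for this example only).
TOY_LEXICON = {
    "vague": ["V", "EY", "G"],
    "idea": ["AY", "D", "IY", "AH"],
}


def text_to_phonemes(text):
    """Convert recognized text into the expected phoneme sequence.

    Raises KeyError for words missing from the toy lexicon.
    """
    phonemes = []
    for word in text.lower().split():
        phonemes.extend(TOY_LEXICON[word])
    return phonemes


def find_mismatches(expected, heard):
    """Return (expected_part, heard_part) pairs where the sequences differ."""
    matcher = SequenceMatcher(a=expected, b=heard, autojunk=False)
    return [
        (expected[i1:i2], heard[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


expected = text_to_phonemes("vague idea")
# Pretend the speaker said "W" instead of "V" at the start of "vague".
heard = ["W", "EY", "G", "AY", "D", "IY", "AH"]
mismatches = find_mismatches(expected, heard)
```

In this toy run, `mismatches` contains a single pair showing that "V" was expected where "W" was heard, which is the kind of feedback the app could surface after a call.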

Privacy: of course, we don’t collect the recordings themselves; we only keep statistics on errors. Besides, we don’t need English speech with a strong accent. I can record that for hours myself 😅
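
As a rough sketch of what "only keeping statistics on errors" could look like, the Python snippet below stores nothing but counts of (expected sound, heard sound) pairs. The class name `ErrorStats` and its methods are hypothetical and chosen for illustration; the actual storage format may be quite different.

```python
from collections import Counter


class ErrorStats:
    """Keeps only counts of pronunciation errors, never the audio itself."""

    def __init__(self):
        self.counts = Counter()

    def record(self, mismatches):
        """Add (expected, heard) phoneme pairs found in one analyzed chunk."""
        for expected, heard in mismatches:
            self.counts[(expected, heard)] += 1

    def most_common(self, n=3):
        """Return the n most frequent errors with their counts."""
        return self.counts.most_common(n)


stats = ErrorStats()
stats.record([("V", "W"), ("TH", "S")])  # errors from a first chunk
stats.record([("V", "W")])               # errors from a later chunk
top_errors = stats.most_common(1)
```

Counts like these are enough to rank a user's most frequent mistakes over time without retaining any recordings.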

Try Fluently to evaluate your English level in 4 mins

Next Steps

I’m continuously working to improve Fluently. First of all, we are building web and mobile apps to significantly increase Fluently's potential audience. We are also creating Duolingo-style daily exercises based on your most common mistakes. It’s about making learning fun and useful as part of a daily routine.

Got questions or feedback? I’d love to hear from you!