Suppose we have two Docker containers, A and B. After container A processes some data, it sends the data to container B. Container B processes the data further and returns the result to container A. I will explain how to achieve this sequence of operations with an actual example.
Before explaining, we will identify some important points.
- The address for accessing container B must be specified correctly. This is vital.
  - When two containers are launched from the same docker-compose.yml file, they are assigned to the same network.
  - Containers in the same network can communicate with each other using each other's service name and port number in the network.
- Be aware of the order in which the containers start: container A is started after container B.
  - Container B, as a server, receives data from container A, so it must be started first.
  - Specify the startup order with depends_on in the docker-compose.yml file.
- Wait a certain amount of time for a value to be returned from container B.
  - Set a timeout on the request so that container A waits a bounded amount of time for the response, instead of giving up too early or hanging indefinitely if container B never answers.
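Note that depends_on only controls the order in which the containers are started; it does not wait until the server inside container B is ready to accept connections. One possible way to cope with this, shown here as a minimal sketch, is to retry the request a few times before giving up. The helper name call_with_retry and its parameters are hypothetical and are not part of the scripts in this article.

```python
import time


def call_with_retry(action, attempts=5, delay=1.0, sleep=time.sleep):
    """Call `action` until it succeeds or `attempts` tries are used up.

    `action` is any zero-argument callable, for example a function that
    wraps requests.post(..., timeout=5). If every attempt fails, the last
    exception is re-raised.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as error:  # real code should catch a narrower type
            last_error = error
            if attempt < attempts:
                sleep(delay)  # give the server time to finish starting
    raise last_error


# Stand-in for a request that fails twice before the server is ready.
state = {"calls": 0}


def fake_request():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("server not ready yet")
    return "punctuated text"


print(call_with_retry(fake_request, attempts=5, delay=0))
```

In container A, the action would be a small function that performs the POST request with its own timeout, so each attempt is bounded and the total waiting time is bounded as well.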
Now let's look at the implementation flow of each container.
Creating the container A
The role of the container A is to obtain a transcription of a YouTube video.
Specify the URL of the YouTube video.
Obtain the id from the URL.
Based on the id, use a Python library to obtain a transcription of the video.
Send the text data of the transcription to the container B.
Receive the text data processed separately in the container B.
Save the received text to a file.
The script that implements the above flow is as follows:
import urllib.parse
from youtube_transcript_api import YouTubeTranscriptApi
import requests

url = "https://www.youtube.com/watch?v=CJjSOzb0IYs"
parsed_url = urllib.parse.urlparse(url)
query_params = urllib.parse.parse_qs(parsed_url.query)
video_id = query_params.get("v")  # a list holding the value after "v=", or None

if video_id:
    video_id = video_id[0]
    transcript_list = YouTubeTranscriptApi.get_transcript(video_id)
    transcription = ""
    for t in transcript_list:
        transcription += " " + t["text"]
    with open("transcription.txt", "w") as file:
        file.write(transcription)
    # Send the request and wait up to 5 seconds for the response
    try:
        posted = requests.post("http://punctuate_text:5000/punctuate", data={"text": transcription}, timeout=5)
        posted.raise_for_status()  # Raise an exception for non-2xx responses
        if posted.status_code == 200:
            with open("output.txt", "w") as file:
                file.write(posted.text)
    except requests.exceptions.RequestException as e:
        print("Error occurred:", e)
else:
    print("No video ID found.")
⭐️ The most noteworthy line is the following one.
posted = requests.post("http://punctuate_text:5000/punctuate", data={"text": transcription}, timeout=5)
- punctuate_text: The service name of container B, specified in the docker-compose file that appears later.
- 5000: The port number in the network, also specified in the docker-compose file.
- /punctuate: The route defined in the Flask server of container B.
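To make the three parts of the address explicit, here is a small sketch that composes the URL from the service name, the port, and the route, and then splits it apart again. The function name build_service_url is a hypothetical helper introduced only for illustration; the values are the ones used in this article.

```python
from urllib.parse import urlsplit, urlunsplit


def build_service_url(service_name, port, route):
    """Compose the address of a service on the shared Compose network."""
    path = route if route.startswith("/") else "/" + route
    # The host part is simply "<service name>:<port>"; Docker's internal
    # DNS resolves the service name to the container's address.
    return urlunsplit(("http", f"{service_name}:{port}", path, "", ""))


url = build_service_url("punctuate_text", 5000, "/punctuate")
parts = urlsplit(url)

print(url)
print(parts.hostname, parts.port, parts.path)
```

Keeping the three values separate makes it easier to see which one has to change when the service is renamed or the Flask port is moved.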
The Dockerfile for container creation is as follows.
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt requirements.txt
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python3", "app.py"]
Creating container B
The role of the container B is to punctuate the transcription sent from the container A.
Create a server using Python's Flask.
Create a route for creating the punctuation.
Receive text data POSTed to the route.
Punctuate the text using SpaCy, Python's natural language processing library.
Return the punctuated text.
Specify the server port number 5000.
The script that implements the above flow is as follows:
from flask import Flask, request
import spacy

app = Flask(__name__)
nlp = spacy.load("en_core_web_sm")

@app.route('/punctuate', methods=['POST'])
def punctuate_text():
    # Read the form field sent by container A; fall back to an empty string
    text = request.form.get('text', '')
    doc = nlp(text)
    punctuated_text = ""
    for sentence in doc.sents:  # spaCy exposes sentences as doc.sents
        punctuated_text += sentence.text + ". "
    return punctuated_text

if __name__ == '__main__':
    # Listen on all interfaces so other containers can reach the server
    app.run(host='0.0.0.0', port=5000)
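Container A passes data={"text": ...} to requests.post, which sends the body as form-encoded data; that is why the Flask route reads the value from request.form rather than parsing JSON. The round trip can be imitated with the standard library alone, as in the sketch below. The value of sample_text is made up for illustration.

```python
from urllib.parse import parse_qs, urlencode

sample_text = "hello world this is a test"  # made-up transcription

# What the client puts in the request body for data={"text": sample_text}
body = urlencode({"text": sample_text})

# What a server recovers after decoding the form-encoded body
form = parse_qs(body)
received = form["text"][0]

print(body)
print(received)
```

If container A sent JSON instead (json={"text": ...}), the field would no longer appear in request.form, so both sides need to agree on the encoding.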
The Dockerfile for container creation is as follows.
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt requirements.txt
RUN pip install -U pip setuptools wheel
RUN pip install --no-cache-dir -r requirements.txt
RUN python -m spacy download en_core_web_sm
COPY . .
CMD ["python3", "app.py"]
Creating the Docker Compose file
The entire file should look like this:
# docker-compose.dev.yml
version: "3.8"
services:
  # Container A (service name assumed here to match container_name)
  transcription:
    build:
      context: ./python
    container_name: transcription
    depends_on:
      - punctuate_text
    volumes:
      - ./python:/app
  # Container B
  punctuate_text:
    build:
      context: ./punctuate
    container_name: punctuate
    ports:
      - 8000:5000
Important part of the container A description.
depends_on:
- punctuate_text
- By specifying the service name of the container B, you can start the container A after the container B.
Important parts of the container B description.
ports:
- 8000:5000
- In 8000:5000, the number 5000 is the port in the network, the same as the port number of the Flask server in container B. The number 8000 is the port published on the host machine.
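A ports entry is written as HOST:CONTAINER. As a rough illustration, the sketch below splits the mapping and shows which address applies from where: containers on the same Compose network use the service name with the container port, while a browser or script on the host machine uses localhost with the published port. The function name addresses_for is hypothetical, and the sketch only handles the simple HOST:CONTAINER form.

```python
def addresses_for(service_name, port_mapping, route):
    """Return the URL to use inside the Compose network and from the host."""
    # Only the plain "HOST:CONTAINER" form is handled in this sketch.
    host_port, container_port = (int(p) for p in port_mapping.split(":"))
    return {
        "from_other_container": f"http://{service_name}:{container_port}{route}",
        "from_host": f"http://localhost:{host_port}{route}",
    }


urls = addresses_for("punctuate_text", "8000:5000", "/punctuate")
print(urls["from_other_container"])
print(urls["from_host"])
```

This is why container A uses port 5000 in its request even though the compose file also mentions 8000.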
Run docker-compose command
Start the two containers.
docker-compose -f docker-compose.dev.yml up -d
Summary
- The address to access the containers is determined by the service name and port number in the docker-compose file.
- Pay attention to the order in which containers are started.
- Set the time to wait for a response from the server.
Conclusion
It took me a while to solve the problem because I didn't know how Docker was resolving the address dynamically.