Background
Up until now, my team at work has been primarily PHP developers. We recently deployed our first Go application (a REST API) to our production server to handle some of the backend functions of our primary Symfony app.
Issue
After 15-20 minutes of inactivity, the next request to our API would fail with an invalid connection error on any database query, but the query after that would succeed.
This would normally happen overnight, when we had much lower traffic to our API.
I suspected the idle connections were being killed by the MySQL server. We never had this problem with PHP because a new MySQL connection was made on each request, and if a PHP script ran for more than 15 minutes it was constantly making queries, keeping the connection alive.
Solution
I thought of adjusting the timeout on the MySQL server, but it is managed for us by another team, so I was hoping to find a solution that my team could implement ourselves.
Looking through `go doc sql.DB`, I saw it had a `Ping()` method. The description says: "Ping verifies a connection to the database is still alive, establishing a connection if necessary."
That sounded like exactly what would solve my problem, and it did! I just set up a simple goroutine to run the `Ping()` method every 5 minutes.
```go
package main

import (
	"database/sql"
	"log"
	"time"

	_ "github.com/go-sql-driver/mysql"
)

var db *sql.DB

func main() {
	db, _ = sql.Open("mysql", "user:password@/dbname")
	// Error handling omitted for this example

	go periodicPing()

	// Start the web server
}

// periodicPing pings the database every 5 minutes to keep the
// pooled connections from being killed by the server's idle timeout.
func periodicPing() {
	for {
		time.Sleep(time.Minute * 5)
		err := db.Ping()
		if err != nil {
			log.Println(err)
		}
	}
}
```
Update (Jan 3, 2019)
I acknowledge that this was not a good solution and I wanted to come back and reference an update to my issue:
Top comments (9)
Thumbs up for writing a post. Even though it's short and simple, one needs some courage to write. I never had it so far ))
“MySQL server [but it] is managed for us by another team” - this is the “problem” of your software. It's definitely worth fixing, by adding devops skills to the team and replicating (via event/data streams) to your own database.
That way you could solve the original problem in a cleaner way: by configuring your database )
Thanks! I know this is part of the point of doing it, but this was my worst fear, putting something out there and then being wrong. lol
I'm not convinced replicating the whole database is a good use of resources. While I agree it would be good to learn how to manage the database within our team, I still would rather have a dedicated database team monitoring and maintaining the server. But maybe I misunderstood what you meant.
What I meant is that in the age of cloud and SaaS you can have a DB instance with little maintenance overhead. Of course, you would need some DBA skills on your team, but it's miles away from the “traditional” database admin guy. Data-wise there are techniques like domain-driven design with its notion of bounded contexts, which let you define which data writes belong to which service (team, product) and what the transaction boundaries are (if any). With this knowledge you can split data by service, replicate data from other services as read-only (as in an event-based or event-driven setup, not MySQL binlog replication), and include DB/data management in your team. At that point all your infrastructure and its config can be moved into your git repo, as in Infrastructure as Code, and the DB timeout parameter becomes part of yet another pull request to be deployed by you (or, ideally, your CD pipeline).
Gotcha, thanks for the explanation! You make some good points.
It's a way to solve this problem, but you cannot be sure 5 minutes will always be enough to keep connections up. The DBA may change connection timeouts at any time for a bunch of perfectly reasonable reasons. You should know that connections to the DB are a pretty expensive resource in highly loaded databases, and this kind of default behavior in your application may be considered destructive. So I suppose such behavior should be agreed with the DBA/architecture guys, and ping intervals should be configurable.
To be honest, I consider the existence of db.Ping() a bad API decision. I cannot imagine a situation where I'd need to make a request to a disconnected database without a prior connection just to get an obvious failure. So I suppose it is excessive and should be moved inside all of the db request functions.
Thanks for the insight, I'll definitely discuss this with our DB team to see if we can work out something better.
You could also just set this when you open the connection:
`db.SetConnMaxLifetime(5 * time.Minute)`
I don't think it's a good workaround. What happens if you lose the connection but your goroutine is still waiting for its next run?
Also, instead of for+sleep you can use time.NewTicker
So would the suggestion here be to retry faster if the connection is lost?
Thanks, I'll look into that!