MySQL variables are a handy way to store data from a row, so that it can be used in calculations with other rows of data.
Setting Variables
Variables must always have a preset value. Without the preset value, the query will still run, but the results will not give the expected outcome.
Variables can be set by using SET
The variable can also be set using a subquery.
SET @last_date = '1970-01-01'
SELECT * FROM seaservice,(SELECT @last_date := '1970-01-01') init
Setting Up Your Database
For the purposes of this article I am setting up a database as follows:
CREATE TABLE `seaservice` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`vessel` int(11) NOT NULL,
`position` int(11) NOT NULL,
`start_date` datetime NOT NULL,
`end_date` datetime NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;
INSERT INTO `seaservice` (`id`, `user_id`, `vessel`, `position`, `start_date`, `end_date`) VALUES
(1, 1, 1, 1, '2023-01-01 00:00:00', '2023-01-01 00:00:00'),
(2, 1, 1, 1, '2023-01-02 00:00:00', '2023-01-02 00:00:00'),
(3, 1, 2, 1, '2023-01-03 00:00:00', '2023-01-03 00:00:00'),
(4, 1, 1, 2, '2023-01-04 00:00:00', '2023-01-04 00:00:00'),
(5, 1, 1, 2, '2023-01-05 00:00:00', '2023-01-05 00:00:00'),
(6, 1, 2, 1, '2023-01-06 00:00:00', '2023-01-06 00:00:00');
Using A Variable In A Query
To return the value of a variable in a query, just add the variable to the select statement
SELECT id,@last_date FROM seaservice,
(SELECT @last_date := '1970-01-01') init
This will return the following data
id | @last_date |
---|---|
1 | 1970-01-01 |
2 | 1970-01-01 |
3 | 1970-01-01 |
4 | 1970-01-01 |
5 | 1970-01-01 |
6 | 1970-01-01 |
Assigning Values To Variables
Assigning a value to a variable is done using
@last_date := last_date
the := syntax is equivalent to SET
In a query this would look like
SELECT id,@last_date,@last_date := end_date FROM seaservice,
(SELECT @last_date := '1970-01-01') init
Each time the query hits a row, it will update the @last_date with the date in the current table row.
id | @last_date | @last_date := end_date |
---|---|---|
1 | 1970-01-01 | 2023-01-01 00:00:00 |
2 | 2023-01-01 00:00:00 | 2023-01-02 00:00:00 |
3 | 2023-01-02 00:00:00 | 2023-01-03 00:00:00 |
4 | 2023-01-03 00:00:00 | 2023-01-04 00:00:00 |
5 | 2023-01-04 00:00:00 | 2023-01-05 00:00:00 |
6 | 2023-01-05 00:00:00 | 2023-01-06 00:00:00 |
Note that the two columns contain different dates. The @last_date is displaying the currently stored value. The assigning column is displaying what is going to be stored for the next row to use.
Important Note: The order you put these in matters. If you set the value before the displaying of the value, then it will change the outcome.
Both columns will return the same value if the two fields in the select statement are swapped.
SELECT id,@last_date,@last_date := end_date FROM seaservice,
(SELECT @last_date := '1970-01-01') init
Is not the same as
SELECT id,@last_date := end_date,@last_date FROM seaservice,
(SELECT @last_date := '1970-01-01') init
Using The Variables In Calculations
Before the query get more complex, it can be a good idea to relegate the variable assignment to a sub query and just return the fields you really need later.
SELECT id,last_date,assigned_date
FROM (SELECT id,@last_date last_date,
@last_date := end_date assigned_date
FROM seaservice,
(SELECT @last_date := '1970-01-01') init)t
Now instead of returning the date values, we can return a calculation.
SELECT id,DATEDIFF(assigned_date,last_date) as days
FROM (SELECT id,@last_date last_date,@last_date := end_date assigned_date
FROM seaservice,
(SELECT @last_date := '1970-01-01') init)t
id | days |
---|---|
1 | 19358 |
2 | 1 |
3 | 1 |
4 | 1 |
5 | 1 |
6 | 1 |
A More Complex Use Case For Variables
For a demonstration of how variables can be used in a more complex scenario, I want to run a report that shows me the start and end dates of days that a user worked consecutively, on the same vessel, in the same position.
If they worked a different vessel or worked a different position, or there was a gap of days, then it will need to be a separate row.
The end result for the data entered earlier should be
days | start_date | end_date | vessel | position |
---|---|---|---|---|
2 | 2023-01-01 00:00:00 | 2023-01-02 00:00:00 | 1 | 1 |
1 | 2023-01-03 00:00:00 | 2023-01-03 00:00:00 | 2 | 1 |
2 | 2023-01-04 00:00:00 | 2023-01-05 00:00:00 | 1 | 2 |
1 | 2023-01-06 00:00:00 | 2023-01-06 00:00:00 | 2 | 1 |
To achieve this we would need to keep track of the date, the vessel and the position, so we can compare them all to the next row to see if they really did work the same vessel and same position the next day.
Start building the query by first adding those variables in and making sure they are returning correct values.
SELECT last_type,last_position,last_date,start_date,
end_date, vessel, position
FROM (
SELECT @last_type last_type,@last_position last_position ,
@last_date last_date,start_date,
end_date,position, vessel,
@last_type := vessel,
@last_position := position,
@last_date := end_date
FROM seaservice, (
SELECT
@last_type := NULL,
@last_position := NULL,
@last_date := NULL
) init
) t
last_type | last_position | last_date | start_date | end_date | vessel | position |
---|---|---|---|---|---|---|
NULL | NULL | NULL | 2023-01-01 00:00:00 | 2023-01-01 00:00:00 | 1 | 1 |
1 | 1 | 2023-01-01 00:00:00 | 2023-01-02 00:00:00 | 2023-01-02 00:00:00 | 1 | 1 |
1 | 1 | 2023-01-02 00:00:00 | 2023-01-03 00:00:00 | 2023-01-03 00:00:00 | 2 | 1 |
2 | 1 | 2023-01-03 00:00:00 | 2023-01-04 00:00:00 | 2023-01-04 00:00:00 | 1 | 2 |
1 | 2 | 2023-01-04 00:00:00 | 2023-01-05 00:00:00 | 2023-01-05 00:00:00 | 1 | 2 |
1 | 2 | 2023-01-05 00:00:00 | 2023-01-06 00:00:00 | 2023-01-06 00:00:00 | 2 | 1 |
Now that the data looks like it is correct, we need a way to compare and group like items together.
For this we can add an additional variable (@group) to store an ID that could later be used to group items together.
SELECT group_by,last_type,last_position,last_date,start_date,
end_date, vessel, position FROM (
SELECT @group group_by,@last_type last_type,
@last_position last_position ,@last_date last_date,
start_date, end_date,position, vessel,
@group := @group + 1,
@last_type := vessel,
@last_position := position,
@last_date := end_date
FROM seaservice, (
SELECT
@group := 0,
@last_type := NULL,
@last_position := NULL,
@last_date := NULL
) init
) t
group_by | last_type | last_position | last_date | start_date | end_date | vessel | position |
---|---|---|---|---|---|---|---|
0 | NULL | NULL | NULL | 2023-01-01 00:00:00 | 2023-01-01 00:00:00 | 1 | 1 |
1 | 1 | 1 | 2023-01-01 00:00:00 | 2023-01-02 00:00:00 | 2023-01-02 00:00:00 | 1 | 1 |
2 | 1 | 1 | 2023-01-02 00:00:00 | 2023-01-03 00:00:00 | 2023-01-03 00:00:00 | 2 | 1 |
3 | 2 | 1 | 2023-01-03 00:00:00 | 2023-01-04 00:00:00 | 2023-01-04 00:00:00 | 1 | 2 |
4 | 1 | 2 | 2023-01-04 00:00:00 | 2023-01-05 00:00:00 | 2023-01-05 00:00:00 | 1 | 2 |
5 | 1 | 2 | 2023-01-05 00:00:00 | 2023-01-06 00:00:00 | 2023-01-06 00:00:00 | 2 | 1 |
The @group can now be used to add a comparison. If the vessel, position and consecutive dates match up, the group ID stays the same. If they don't match , increment the Group ID.
To do this, the comparisons needs to be added to the @group assignment
@group := @group + 1 - (
vessel <=> @last_type
AND position <=> @last_position
AND start_date <=> @last_date + INTERVAL 1 DAY
) group_by
The comparisons in the parentheses will return 1 if all three comparisons are true and 0 if any of the comparisons are false, which will increment the group ID.
And now we can group by group_by. We can also return the lowest start_date in the group and the highest end_date in the group to get the date range they worked, and calculate the number of days in a row they worked on the same vessel and same position.
The final query will look like:
SELECT DATEDIFF(max(end_date),min(start_date))+ 1 as days,
MIN(start_date) as start_date, MAX(end_date) as end_date,
vessel, position FROM (
SELECT start_date, end_date,position, vessel,
@group := @group + 1 - (
vessel <=> @last_type
AND position <=> @last_item
AND start_date <=> @last_date + INTERVAL 1 DAY
) group_by,
@last_type := vessel,
@last_item := position,
@last_date := end_date
FROM seaservice, (
SELECT @group := 0,
@last_type := NULL,
@last_item := NULL,
@last_date := NULL
) init
) t GROUP BY group_by
Which will return
days | start_date | end_date | vessel | position |
---|---|---|---|---|
2 | 2023-01-01 00:00:00 | 2023-01-02 00:00:00 | 1 | 1 |
1 | 2023-01-03 00:00:00 | 2023-01-03 00:00:00 | 2 | 1 |
2 | 2023-01-04 00:00:00 | 2023-01-05 00:00:00 | 1 | 2 |
1 | 2023-01-06 00:00:00 | 2023-01-06 00:00:00 | 2 | 1 |
Depending on your data, you may need to add ORDER BY to ensure the rows are returning in an order that makes sense for your query.
Top comments (0)