Can't use SUM in SQL when the value is not integer
I am using below SQL in SNOWFLAKE and couldn't able to get the count of the instance. The end goal that i am trying to achieve is that i want to know that count of teams and instances that have the same the same Issue.
My SQL query:
SELECT
priority,
QID,
issue,
solution,
team,
instance,
COUNT(team) AS TEAM_COUNT,
SUM(instance) AS Instance_count
FROM
ABCDTABLE
WHERE
priority != 'NONE'
GROUP_BY
priority, QID, issue, solution, team, instance
ORDER_BY
priority, QID, issue, solution, team, instance
I get this error:
Numeric value 'i-21232' is not recognized
Sample data:
Priority QID Issue SOLUTION TEAM INSTANCE
HIGH 123 Wrong package version Update Package. Devops. i-21232
HIGH 123 wrong package version update package. infra. i-12341
MEDIUM 721. SSH vulnerability. Remove old version. App i-323232
Expected Output:
Priority QID. Issue Solution count(TEAM) COUNT(INSTANCE)
HIGH. 123. Wrong Package Version. Update package. 2 2
Thank You
You may need to remove the TEAM and INSTANCE columns from the GROUP BY, and change the SUM(INSTANCE) to a COUNT(INSTANCE).
As stated above, you can't SUM data the contains non-numerical values.
E.g. does the below work for you?
SELECT
priority,
QID,
issue,
solution,
team,
instance,
COUNT(team) AS TEAM_COUNT,
COUNT(instance) AS Instance_count
FROM
ABCDTABLE
WHERE
priority != 'NONE'
GROUP BY
priority, QID, issue, solution
ORDER BY
priority, QID, issue, solution, team, instance
So if we take the pasted SQL and data. then fixed the GROUP BY
and ORDER BY
and change SUM(instance)
to COUNT(instance)
WITH ABCDTABLE(Priority, QID, Issue, SOLUTION,TEAM, INSTANCE) AS (
SELECT * FROM VALUES
('HIGH', 123, 'Wrong package version', 'Update Package.', 'Devops.', 'i-21232'),
('HIGH', 123, 'wrong package version', 'update package.', 'infra.', 'i-12341'),
('MEDIUM',721, 'SSH vulnerability.', 'Remove old version.', 'App', 'i-323232')
)
SELECT
priority,
QID,
issue,
solution,
team,
instance,
COUNT(team) AS TEAM_COUNT,
COUNT(instance) AS Instance_count
FROM
ABCDTABLE
WHERE
priority != 'NONE'
GROUP BY
priority, QID, issue, solution, team, instance
ORDER BY
priority, QID, issue, solution, team, instance
we get some perfectly valid SQL, which returns:
PRIORITY | QID | ISSUE | SOLUTION | TEAM | INSTANCE | TEAM_COUNT | INSTANCE_COUNT |
---|---|---|---|---|---|---|---|
HIGH | 123 | Wrong package version | Update Package. | Devops. | i-21232 | 1 | 1 |
HIGH | 123 | wrong package version | update package. | infra. | i-12341 | 1 | 1 |
MEDIUM | 721 | SSH vulnerability. | Remove old version. | App | i-323232 | 1 | 1 |
which is not what you want.
So like everybody else, removing TEAM
& INSTANCE
from the selection and groupings would give better results.
But you input data is clearly inconsistent. so to lowering will improve that, but then our group by should be swapped to the output column references 1,2,3,4
WITH ABCDTABLE(Priority, QID, Issue, SOLUTION,TEAM, INSTANCE) AS (
SELECT * FROM VALUES
('HIGH', 123, 'Wrong package version', 'Update Package.', 'Devops.', 'i-21232'),
('HIGH', 123, 'wrong package version', 'Update Package.', 'infra.', 'i-12341'),
('MEDIUM',721, 'SSH vulnerability.', 'Remove old version.', 'App', 'i-323232')
)
SELECT
lower(priority) as priority,
QID,
lower(issue) as issue,
lower(solution) as solution,
COUNT(team) AS TEAM_COUNT,
COUNT(instance) AS Instance_count
FROM ABCDTABLE
WHERE priority != 'NONE'
GROUP BY 1,2,3,4
ORDER BY 1,2,3,4
giving:
PRIORITY | QID | ISSUE | SOLUTION | TEAM_COUNT | INSTANCE_COUNT |
---|---|---|---|---|---|
high | 123 | wrong package version | update package. | 2 | 2 |
medium | 721 | ssh vulnerability. | remove old version. | 1 | 1 |
And then as Chris notes, this has the row with a count of one. So you ether made a typo in your example data.
which can be achieved by adding a HAVING TEAM_COUNT > 1
after the GROUP BY clause. You can reuse a already named output value like I have, or count all rows like Chris notes with count(*)
Then there is one point you make in a comment:
the reason i want to use SUM for the instance is that i want to get the total count of the instances that are having the same issue. If i use count on instance, it doesn't show total count.
I translate that into you want the distinct count, which if we boost the data to:
SELECT * FROM VALUES
('HIGH', 123, 'Wrong package version', 'Update Package.', 'Devops.', 'i-21232'),
('HIGH', 123, 'wrong package version', 'update package.', 'infra.', 'i-12341'),
('HIGH', 123, 'wrong package version', 'update package.', 'other', 'i-12341'),
('MEDIUM',721, 'SSH vulnerability.', 'Remove old version.', 'App', 'i-323232')
I read that as your want 2, not 3, thus COUNT(DISTINCT(instance))
can be used.
WITH ABCDTABLE(Priority, QID, Issue, SOLUTION,TEAM, INSTANCE) AS (
SELECT * FROM VALUES
('HIGH', 123, 'Wrong package version', 'Update Package.', 'Devops.', 'i-21232'),
('HIGH', 123, 'wrong package version', 'update package.', 'infra.', 'i-12341'),
('HIGH', 123, 'wrong package version', 'update package.', 'other', 'i-12341'),
('MEDIUM',721, 'SSH vulnerability.', 'Remove old version.', 'App', 'i-323232')
)
SELECT
lower(priority) as priority,
QID,
lower(issue) as issue,
lower(solution) as solution,
COUNT(team) AS TEAM_COUNT,
COUNT(distinct instance) AS Instance_count
FROM ABCDTABLE
WHERE priority != 'NONE'
GROUP BY 1,2,3,4
HAVING TEAM_COUNT > 1
ORDER BY 1,2,3,4
gives:
PRIORITY | QID | ISSUE | SOLUTION | TEAM_COUNT | INSTANCE_COUNT |
---|---|---|---|---|---|
high | 123 | wrong package version | update package. | 3 | 2 |