Group BY
Group BY is used to summarize the table data. As the name implies it will group the set of similar data in a column. Grouping may be based on only one column or multiple columns. Below examples illustrate this:
CREATE TABLE dbo.Employee
In the above fig we can see that EmpDept has a set of similar values as highlighted in read box : FIN,HR,IT.
Single Column Grouping :
Below query returns Number of Employees in each department.
Working:
Below query returns the Number of Employees in each dept with the following condition:
Multiple Column Grouping :
SELECT EmpDept,MAX(EmpSal) AS MaxSal
SELECT DISTINCT COUNT(EmpDept) FROM dbo.Employee
SELECT COUNT(EmpLoc) FROM dbo.Employee
select COUNT(*) from dbo.TestCnt -----> 7
select COUNT(*) from dbo.TestCnt1 -----> 7
Group BY is used to summarize the table data. As the name implies it will group the set of similar data in a column. Grouping may be based on only one column or multiple columns. Below examples illustrate this:
CREATE TABLE dbo.Employee
(
EmpID VARCHAR(5),
EmpName VARCHAR(40),
EmpDept VARCHAR(10),
EmpLoc VARCHAR(40),
EmpSal MONEY
)
INSERT INTO dbo.Employee VALUES('E001','AAA','HR','Delhi',20000)
INSERT INTO dbo.Employee VALUES('E002','BBB','IT','Bangalore',30000)
INSERT INTO dbo.Employee VALUES('E003','CCC','IT','Delhi',15000)
INSERT INTO dbo.Employee VALUES('E004','AAA','HR','Bangalore',10000)
INSERT INTO dbo.Employee VALUES('E005','GGG','FIN','Hydrabad',25000)
INSERT INTO dbo.Employee VALUES('E006','YYY','FIN','Hydrabad',26000)
INSERT INTO dbo.Employee VALUES('E007','SSS','IT','Goa',40000)
INSERT INTO dbo.Employee VALUES('E008','WWW','IT','HP',26000)
INSERT INTO dbo.Employee VALUES('E009','WWW','FIN','HP',80000)
INSERT INTO dbo.Employee VALUES('E010','QQQ','FIN','Gujrath',13000)
SELECT * FROM dbo.Employee
SELECT * FROM dbo.Employee ORDER BY EmpDept
In the above fig we can see that EmpDept has a set of similar values as highlighted in read box : FIN,HR,IT.
Single Column Grouping :
Below query returns Number of Employees in each department.
SELECT EmpDept,COUNT(*) AS NoOfEmp
FROM dbo.Employee
GROUP BY EmpDept
ORDER BY EmpDept
Below query returns Number of Employees in each department with the following condition:
- A dept should have more than 2 employees, only those records should be displayed.
SELECT EmpDept,COUNT(*) AS NoOfEmp
FROM dbo.Employee
GROUP BY EmpDept
HAVING COUNT(*)>2
ORDER BY EmpDept
Working:
Below query returns the Number of Employees in each dept with the following condition:
- A dept should have more than 2 employees, only those records should be displayed.
- Except FIN dept all other dept should be displayed.
SELECT EmpDept,COUNT(*) AS NoOfEmp
FROM dbo.Employee
WHERE EmpDept<>'FIN'
GROUP BY EmpDept
HAVING COUNT(*)>2
ORDER BY EmpDept
Multiple Column Grouping :
Below query is the example for multiple column grouping
which will return, in a dept how many employees are from same places. In below
example we can see, from FIN dept from Hydrabad there are 2 employees and rest
all have 1 employee.
SELECT EmpDept,EmpLoc,COUNT(*) AS NoOfEmp
FROM dbo.Employee
GROUP BY EmpDept,EmpLoc
ORDER BY EmpDept
Working:
Other Examples :
Similarly instead of count we can use aggregate functions like MIN(), MAX(), AVG(), SUM() etc.
Below queries returns Minimum and Maximum salary in each department
SELECT EmpDept,MIN(EmpSal) AS MinSal
FROM dbo.Employee
GROUP BY EmpDept
SELECT EmpDept,MAX(EmpSal) AS MaxSal
FROM dbo.Employee
GROUP BY EmpDept
Below query returns average salary in each department.
SELECT EmpDept,AVG(EmpSal) AS AvgSal
FROM dbo.Employee
GROUP BY EmpDept
Below query returns total salary paid in each department.
SELECT EmpDept,SUM(EmpSal) AS SumSal
FROM dbo.Employee
GROUP BY EmpDept
HAVING and WHERE
People say that HAVING cannot exists without GROUP BY, but this is wrong. HAVING can exists without GROUP BY. When HAVING alone is used its compulsory to use aggregate function. In the absence of GROUP BY HAVING will group entire table as one and applies aggregate function.
WHERE is also a filter condition and HAVING is also a filter condition. WHERE applies filter on single record, HAVING applies filter on grouped data. In where
clause we cannot use aggregate functions but in having clause we can use
aggregate function.
Having works
like Where clause with out Group By Clause.
Below query
will return the table count if HAVING condition is satisfied or else no output
is displayed. Here since HAVING exists without GROUP BY entire table is grouped
as one . Since Avg of EmpSal in the entire table is 28500 ((20000.00 +
30000.00 + 15000.00 +
10000.00 + 25000.00 +
26000.00 + 40000.00 +
26000.00 + 80000.00 +
13000.00)/10) and in the Having clause
it is mentioned that if this avg is greater than 28000 ,output 10 is
displayed which is the count of table records. It is something like, if HAVING condition is true it will return the select statement output. Similarly you can
find the one for average also.
SELECT COUNT(*)
FROM dbo.Employee
HAVING AVG(EmpSal)>28000
SELECT AVG(EmpSal)
FROM dbo.Employee
HAVING AVG(EmpSal)>28000
Distinct/Count
Distinct will remove the duplicate records in the output as shown in below example:
SELECT DISTINCT EmpDept FROM
dbo.Employee
Below query first it will
take the count of EmpDept values and then applies distinct on the result
obtained. First it will return 10 and DISTINCT 10 is always 10.
Below query first it will
take the distinct values and then applies count on the result obtained. First it will return the distinct EmpDept which is 3 records IT,FIN,HR then COUNT of obtained result is 3.
SELECT COUNT(DISTINCT EmpDept) FROM dbo.Employee
Lets see DISTINCT and COUNT with NULL's and BLANK's in a column
UPDATE dbo.Employee
SET EmpLoc=''
WHERE EmpID='E006'
UPDATE dbo.Employee
SET EmpLoc=NULL
WHERE EmpID='E007'
SELECT DISTINCT EmpLoc FROM
dbo.Employee
SELECT COUNT(EmpLoc) FROM dbo.Employee
SELECT COUNT(*) FROM dbo.Employee
SELECT COUNT(EmpLoc) AS EmpLocCnt,COUNT(EmpSal) EmpSalCnt FROM dbo.Employee
- DISTINCT will consider both NULL as well as BLANK values. NULL is nothing whereas BLANK is a string of zero characters.
- COUNT will not consider NULL values but it considers BLANK values.
- “*” represents all the columns in the table. So when COUNT(*) is specified the count is taken on group of columns. So irrespective of one column EmpLoc it will consider all columns and displays count as 12.
CREATE TABLE dbo.TestCnt
(
Col1 varchar(max),
Col2 varchar(max),
Col3 varchar(max),
Col4 varchar(max)
)
insert into dbo.TestCnt values(NULL,NULL,NULL,NULL)
insert into dbo.TestCnt values(NULL,NULL,NULL,NULL)
insert into dbo.TestCnt values(NULL,NULL,NULL,NULL)
insert into dbo.TestCnt values(NULL,NULL,NULL,NULL)
insert into dbo.TestCnt values(NULL,NULL,NULL,NULL)
insert into dbo.TestCnt values(NULL,NULL,NULL,NULL)
insert into dbo.TestCnt values(NULL,NULL,NULL,NULL)
select * from dbo.TestCnt
select COUNT(*) from dbo.TestCnt -----> 7
select COUNT(1) from dbo.TestCnt -----> 7
select COUNT(Col1) from dbo.TestCnt -----> 0
select COUNT(Col2) from dbo.TestCnt -----> 0
select COUNT(Col1,Col2) from dbo.TestCnt -----> Error
select COUNT(Col1),COUNT(Col2) from dbo.TestCnt -----> 0,0
select COUNT(1),COUNT(Col2) from dbo.TestCnt -----> 7,0
select COUNT(1),COUNT(*) from dbo.TestCnt -----> 7,7
select COUNT(*),COUNT(*) from dbo.TestCnt -----> 7,7
CREATE TABLE dbo.TestCnt1
(
Col1 varchar(max),
Col2 varchar(max),
Col3 varchar(max),
Col4 varchar(max)
)
insert into dbo.TestCnt1 values('',NULL,NULL,NULL)
insert into dbo.TestCnt1 values(NULL,'',NULL,NULL)
insert into dbo.TestCnt1 values(NULL,NULL,NULL,NULL)
insert into dbo.TestCnt1 values(NULL,NULL,NULL,NULL)
insert into dbo.TestCnt1 values(NULL,NULL,NULL,NULL)
insert into dbo.TestCnt1 values(NULL,NULL,NULL,NULL)
insert into dbo.TestCnt1 values(NULL,'',NULL,NULL)
select * from dbo.TestCnt1
select COUNT(*) from dbo.TestCnt1 -----> 7
select COUNT(1) from dbo.TestCnt1 -----> 7
select COUNT(Col1) from dbo.TestCnt1 -----> 1
select COUNT(Col2) from dbo.TestCnt1 -----> 2
select COUNT(Col1,Col2) from dbo.TestCnt1 -----> Error
select COUNT(Col1),COUNT(Col2) from dbo.TestCnt1 -----> 1,2
select COUNT(1),COUNT(Col2) from dbo.TestCnt1 -----> 7,2
select COUNT(1),COUNT(*) from dbo.TestCnt1 -----> 7,7
select COUNT(*),COUNT(*) from dbo.TestCnt1 -----> 7,7
No comments:
Post a Comment