Natural (human alpha-numeric) sort in Microsoft SQL Server
up vote
5
down vote
favorite
Thank you for taking time to read all this, its a lot! Appreciate all you fellow enthusiasts!
How to natural sort?
ie. order a set of alpha numeric data to appear as:
Season 1, Season 2, Season 10, Season 20
instead of
Season 1, Season 10, Season 2, Season 20
I use a very practical example of tv seasons in a very practical format as case.
I am looking to accomplish the following:
- Share my working solution for others
- Ask your help in figuring how to shorten it (or find better solution) to my solution
- Can you solve criteria 7 below?
I spent about 2 hours researching online and another 3 hours building this solution. Some of the reference material came from:
- SO Post
- MSDN
- Essential SQL
- Code Project
- DBA Stack Exchange
Some of the solutions found on SO and other sites only work for 90% of cases. However, most/all do NOT work if you have multiple numeric values in your text, or will cause SQL error if there isn't a number found in the text at all.
I have created this SQLFiddle link to play around with (includes all below code).
Here is the create statement:
create table tvseason
(
title varchar(100)
);
insert into tvseason (title)
values ('100 Season 03'), ('100 Season 1'),
('100 Season 10'), ('100 Season 2'),
('100 Season 4'), ('Show Season 1 (2008)'),
('Show Season 2 (2008)'), ('Show Season 10 (2008)'),
('Another Season 01'), ('Another Season 02'),
('Another 1st Anniversary Season 01'),
('Another 2nd Anniversary Season 01'),
('Another 10th Anniversary Season 01'),
('Some Show Another No Season Number'),
('Some Show No Season Number'),
('Show 2 Season 1'),
('Some Show With Season Number 1'),
('Some Show With Season Number 2'),
('Some Show With Season Number 10');
Here is my working solution (only unable to solve criteria #7 below):
select
title, "index", titleLeft,
convert(int, coalesce(nullif(titleRightTrim2, ''), titleRight)) titleRight
from
(select
title, "index", titleLeft, titleRight, titleRightTrim1,
case
when PATINDEX('%[^0-9]%', titleRightTrim2) = 0
then titleRightTrim2
else left(titleRightTrim2, PATINDEX('%[^0-9]%', titleRightTrim2) - 1)
end as titleRightTrim2
from
(select
title,
len(title) - PATINDEX('%[0-9] %', reverse(title)) 'index',
left(title, len(title) - PATINDEX('%[0-9] %', reverse(title))) titleLeft,
ltrim(right(title, PATINDEX('%[0-9] %', reverse(title)))) titleRight,
ltrim(right(title, PATINDEX('%[0-9] %', reverse(title)))) titleRightTrim1,
left(ltrim(right(title, PATINDEX('%[0-9] %', reverse(title)))), PATINDEX('% %', ltrim(right(title, PATINDEX('%[0-9] %', reverse(title)))))) titleRightTrim2
from
tvseason) x) y
order by
titleLeft, titleRight
Criteria to consider:
- Text contains no numbers
- Text contains numbers at beginning and end
- Text contains numbers at beginning only
- Text contains numbers at end only
- Text may contain (YYYY) at end
- Text may end with single digit OR double digit (ex. 1 or 01)
- Optional: Any combination of above, plus numbers in middle of text
Here is the output:
title
100 Season 1
100 Season 2
100 Season 03
100 Season 4
100 Season 10
**Case 7 here**
Another 10th Anniversary Season 01
Another 1st Anniversary Season 01
Another 2nd Anniversary Season 01
Another Season 01
Another Season 02
Show (2008) Season 1
Show (2008) Season 2
Show 2 The 75th Anniversary Season 1
Show Season 1 (2008)
Show Season 2 (2008)
Show Season 10 (2008)
Some Show Another No Season Number
Some Show No Season Number
Some Show With Season Number 1
Some Show With Season Number 2
Some Show With Season Number 10
sql sql-server sorting natural-sort
add a comment |
up vote
5
down vote
favorite
Thank you for taking time to read all this, its a lot! Appreciate all you fellow enthusiasts!
How to natural sort?
ie. order a set of alpha numeric data to appear as:
Season 1, Season 2, Season 10, Season 20
instead of
Season 1, Season 10, Season 2, Season 20
I use a very practical example of tv seasons in a very practical format as case.
I am looking to accomplish the following:
- Share my working solution for others
- Ask your help in figuring how to shorten it (or find better solution) to my solution
- Can you solve criteria 7 below?
I spent about 2 hours researching online and another 3 hours building this solution. Some of the reference material came from:
- SO Post
- MSDN
- Essential SQL
- Code Project
- DBA Stack Exchange
Some of the solutions found on SO and other sites only work for 90% of cases. However, most/all do NOT work if you have multiple numeric values in your text, or will cause SQL error if there isn't a number found in the text at all.
I have created this SQLFiddle link to play around with (includes all below code).
Here is the create statement:
create table tvseason
(
title varchar(100)
);
insert into tvseason (title)
values ('100 Season 03'), ('100 Season 1'),
('100 Season 10'), ('100 Season 2'),
('100 Season 4'), ('Show Season 1 (2008)'),
('Show Season 2 (2008)'), ('Show Season 10 (2008)'),
('Another Season 01'), ('Another Season 02'),
('Another 1st Anniversary Season 01'),
('Another 2nd Anniversary Season 01'),
('Another 10th Anniversary Season 01'),
('Some Show Another No Season Number'),
('Some Show No Season Number'),
('Show 2 Season 1'),
('Some Show With Season Number 1'),
('Some Show With Season Number 2'),
('Some Show With Season Number 10');
Here is my working solution (only unable to solve criteria #7 below):
select
title, "index", titleLeft,
convert(int, coalesce(nullif(titleRightTrim2, ''), titleRight)) titleRight
from
(select
title, "index", titleLeft, titleRight, titleRightTrim1,
case
when PATINDEX('%[^0-9]%', titleRightTrim2) = 0
then titleRightTrim2
else left(titleRightTrim2, PATINDEX('%[^0-9]%', titleRightTrim2) - 1)
end as titleRightTrim2
from
(select
title,
len(title) - PATINDEX('%[0-9] %', reverse(title)) 'index',
left(title, len(title) - PATINDEX('%[0-9] %', reverse(title))) titleLeft,
ltrim(right(title, PATINDEX('%[0-9] %', reverse(title)))) titleRight,
ltrim(right(title, PATINDEX('%[0-9] %', reverse(title)))) titleRightTrim1,
left(ltrim(right(title, PATINDEX('%[0-9] %', reverse(title)))), PATINDEX('% %', ltrim(right(title, PATINDEX('%[0-9] %', reverse(title)))))) titleRightTrim2
from
tvseason) x) y
order by
titleLeft, titleRight
Criteria to consider:
- Text contains no numbers
- Text contains numbers at beginning and end
- Text contains numbers at beginning only
- Text contains numbers at end only
- Text may contain (YYYY) at end
- Text may end with single digit OR double digit (ex. 1 or 01)
- Optional: Any combination of above, plus numbers in middle of text
Here is the output:
title
100 Season 1
100 Season 2
100 Season 03
100 Season 4
100 Season 10
**Case 7 here**
Another 10th Anniversary Season 01
Another 1st Anniversary Season 01
Another 2nd Anniversary Season 01
Another Season 01
Another Season 02
Show (2008) Season 1
Show (2008) Season 2
Show 2 The 75th Anniversary Season 1
Show Season 1 (2008)
Show Season 2 (2008)
Show Season 10 (2008)
Some Show Another No Season Number
Some Show No Season Number
Some Show With Season Number 1
Some Show With Season Number 2
Some Show With Season Number 10
sql sql-server sorting natural-sort
add a comment |
up vote
5
down vote
favorite
up vote
5
down vote
favorite
Thank you for taking time to read all this, its a lot! Appreciate all you fellow enthusiasts!
How to natural sort?
ie. order a set of alpha numeric data to appear as:
Season 1, Season 2, Season 10, Season 20
instead of
Season 1, Season 10, Season 2, Season 20
I use a very practical example of tv seasons in a very practical format as case.
I am looking to accomplish the following:
- Share my working solution for others
- Ask your help in figuring how to shorten it (or find better solution) to my solution
- Can you solve criteria 7 below?
I spent about 2 hours researching online and another 3 hours building this solution. Some of the reference material came from:
- SO Post
- MSDN
- Essential SQL
- Code Project
- DBA Stack Exchange
Some of the solutions found on SO and other sites only work for 90% of cases. However, most/all do NOT work if you have multiple numeric values in your text, or will cause SQL error if there isn't a number found in the text at all.
I have created this SQLFiddle link to play around with (includes all below code).
Here is the create statement:
create table tvseason
(
title varchar(100)
);
insert into tvseason (title)
values ('100 Season 03'), ('100 Season 1'),
('100 Season 10'), ('100 Season 2'),
('100 Season 4'), ('Show Season 1 (2008)'),
('Show Season 2 (2008)'), ('Show Season 10 (2008)'),
('Another Season 01'), ('Another Season 02'),
('Another 1st Anniversary Season 01'),
('Another 2nd Anniversary Season 01'),
('Another 10th Anniversary Season 01'),
('Some Show Another No Season Number'),
('Some Show No Season Number'),
('Show 2 Season 1'),
('Some Show With Season Number 1'),
('Some Show With Season Number 2'),
('Some Show With Season Number 10');
Here is my working solution (only unable to solve criteria #7 below):
select
title, "index", titleLeft,
convert(int, coalesce(nullif(titleRightTrim2, ''), titleRight)) titleRight
from
(select
title, "index", titleLeft, titleRight, titleRightTrim1,
case
when PATINDEX('%[^0-9]%', titleRightTrim2) = 0
then titleRightTrim2
else left(titleRightTrim2, PATINDEX('%[^0-9]%', titleRightTrim2) - 1)
end as titleRightTrim2
from
(select
title,
len(title) - PATINDEX('%[0-9] %', reverse(title)) 'index',
left(title, len(title) - PATINDEX('%[0-9] %', reverse(title))) titleLeft,
ltrim(right(title, PATINDEX('%[0-9] %', reverse(title)))) titleRight,
ltrim(right(title, PATINDEX('%[0-9] %', reverse(title)))) titleRightTrim1,
left(ltrim(right(title, PATINDEX('%[0-9] %', reverse(title)))), PATINDEX('% %', ltrim(right(title, PATINDEX('%[0-9] %', reverse(title)))))) titleRightTrim2
from
tvseason) x) y
order by
titleLeft, titleRight
Criteria to consider:
- Text contains no numbers
- Text contains numbers at beginning and end
- Text contains numbers at beginning only
- Text contains numbers at end only
- Text may contain (YYYY) at end
- Text may end with single digit OR double digit (ex. 1 or 01)
- Optional: Any combination of above, plus numbers in middle of text
Here is the output:
title
100 Season 1
100 Season 2
100 Season 03
100 Season 4
100 Season 10
**Case 7 here**
Another 10th Anniversary Season 01
Another 1st Anniversary Season 01
Another 2nd Anniversary Season 01
Another Season 01
Another Season 02
Show (2008) Season 1
Show (2008) Season 2
Show 2 The 75th Anniversary Season 1
Show Season 1 (2008)
Show Season 2 (2008)
Show Season 10 (2008)
Some Show Another No Season Number
Some Show No Season Number
Some Show With Season Number 1
Some Show With Season Number 2
Some Show With Season Number 10
sql sql-server sorting natural-sort
Thank you for taking time to read all this, its a lot! Appreciate all you fellow enthusiasts!
How to natural sort?
ie. order a set of alpha numeric data to appear as:
Season 1, Season 2, Season 10, Season 20
instead of
Season 1, Season 10, Season 2, Season 20
I use a very practical example of tv seasons in a very practical format as case.
I am looking to accomplish the following:
- Share my working solution for others
- Ask your help in figuring how to shorten it (or find better solution) to my solution
- Can you solve criteria 7 below?
I spent about 2 hours researching online and another 3 hours building this solution. Some of the reference material came from:
- SO Post
- MSDN
- Essential SQL
- Code Project
- DBA Stack Exchange
Some of the solutions found on SO and other sites only work for 90% of cases. However, most/all do NOT work if you have multiple numeric values in your text, or will cause SQL error if there isn't a number found in the text at all.
I have created this SQLFiddle link to play around with (includes all below code).
Here is the create statement:
create table tvseason
(
title varchar(100)
);
insert into tvseason (title)
values ('100 Season 03'), ('100 Season 1'),
('100 Season 10'), ('100 Season 2'),
('100 Season 4'), ('Show Season 1 (2008)'),
('Show Season 2 (2008)'), ('Show Season 10 (2008)'),
('Another Season 01'), ('Another Season 02'),
('Another 1st Anniversary Season 01'),
('Another 2nd Anniversary Season 01'),
('Another 10th Anniversary Season 01'),
('Some Show Another No Season Number'),
('Some Show No Season Number'),
('Show 2 Season 1'),
('Some Show With Season Number 1'),
('Some Show With Season Number 2'),
('Some Show With Season Number 10');
Here is my working solution (only unable to solve criteria #7 below):
select
title, "index", titleLeft,
convert(int, coalesce(nullif(titleRightTrim2, ''), titleRight)) titleRight
from
(select
title, "index", titleLeft, titleRight, titleRightTrim1,
case
when PATINDEX('%[^0-9]%', titleRightTrim2) = 0
then titleRightTrim2
else left(titleRightTrim2, PATINDEX('%[^0-9]%', titleRightTrim2) - 1)
end as titleRightTrim2
from
(select
title,
len(title) - PATINDEX('%[0-9] %', reverse(title)) 'index',
left(title, len(title) - PATINDEX('%[0-9] %', reverse(title))) titleLeft,
ltrim(right(title, PATINDEX('%[0-9] %', reverse(title)))) titleRight,
ltrim(right(title, PATINDEX('%[0-9] %', reverse(title)))) titleRightTrim1,
left(ltrim(right(title, PATINDEX('%[0-9] %', reverse(title)))), PATINDEX('% %', ltrim(right(title, PATINDEX('%[0-9] %', reverse(title)))))) titleRightTrim2
from
tvseason) x) y
order by
titleLeft, titleRight
Criteria to consider:
- Text contains no numbers
- Text contains numbers at beginning and end
- Text contains numbers at beginning only
- Text contains numbers at end only
- Text may contain (YYYY) at end
- Text may end with single digit OR double digit (ex. 1 or 01)
- Optional: Any combination of above, plus numbers in middle of text
Here is the output:
title
100 Season 1
100 Season 2
100 Season 03
100 Season 4
100 Season 10
**Case 7 here**
Another 10th Anniversary Season 01
Another 1st Anniversary Season 01
Another 2nd Anniversary Season 01
Another Season 01
Another Season 02
Show (2008) Season 1
Show (2008) Season 2
Show 2 The 75th Anniversary Season 1
Show Season 1 (2008)
Show Season 2 (2008)
Show Season 10 (2008)
Some Show Another No Season Number
Some Show No Season Number
Some Show With Season Number 1
Some Show With Season Number 2
Some Show With Season Number 10
sql sql-server sorting natural-sort
sql sql-server sorting natural-sort
edited Mar 20 at 5:57
marc_s
566k12610921245
566k12610921245
asked Mar 20 at 0:33
Kairan
2,075204688
2,075204688
add a comment |
add a comment |
3 Answers
3
active
oldest
votes
up vote
3
down vote
I think this will do the trick... I simply recognizes changes from non-numeric to numeric.
I haven't done any large scale testing but It should be reasonably fast.
SET QUOTED_IDENTIFIER ON;
GO
SET ANSI_NULLS ON;
GO
ALTER FUNCTION dbo.tfn_SplitForSort
/* ===================================================================
11/11/2018 JL, Created: Comments
=================================================================== */
--===== Define I/O parameters
(
@string VARCHAR(8000)
)
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
WITH
cte_n1 (n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)),
cte_n2 (n) AS (SELECT 1 FROM cte_n1 a CROSS JOIN cte_n1 b),
cte_Tally (n) AS (
SELECT TOP (LEN(@string))
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM
cte_n2 a CROSS JOIN cte_n2 b
),
cte_split_string AS (
SELECT
col_num = ROW_NUMBER() OVER (ORDER BY t.n) + CASE WHEN LEFT(@string, 1) LIKE '[0-9]' THEN 0 ELSE 1 END,
string_part = SUBSTRING(@string, t.n, LEAD(t.n, 1, 8000) OVER (ORDER BY t.n) - t.n)
FROM
cte_Tally t
CROSS APPLY ( VALUES (SUBSTRING(@string, t.n, 2)) ) s (str2)
WHERE
t.n = 1
OR SUBSTRING(@string, t.n - 1, 2) LIKE '[0-9][^0-9]'
OR SUBSTRING(@string, t.n - 1, 2) LIKE '[^0-9][0-9]'
)
SELECT
so_01 = ISNULL(MAX(CASE WHEN ss.col_num = 1 THEN CONVERT(FLOAT, ss.string_part) END), 99999999),
so_02 = MAX(CASE WHEN ss.col_num = 2 THEN ss.string_part END),
so_03 = MAX(CASE WHEN ss.col_num = 3 THEN CONVERT(FLOAT, ss.string_part) END),
so_04 = MAX(CASE WHEN ss.col_num = 4 THEN ss.string_part END),
so_05 = MAX(CASE WHEN ss.col_num = 5 THEN CONVERT(FLOAT, ss.string_part) END),
so_06 = MAX(CASE WHEN ss.col_num = 6 THEN ss.string_part END),
so_07 = MAX(CASE WHEN ss.col_num = 7 THEN CONVERT(FLOAT, ss.string_part) END),
so_08 = MAX(CASE WHEN ss.col_num = 8 THEN ss.string_part END),
so_09 = MAX(CASE WHEN ss.col_num = 9 THEN CONVERT(FLOAT, ss.string_part) END),
so_10 = MAX(CASE WHEN ss.col_num = 10 THEN ss.string_part END)
FROM
cte_split_string ss;
GO
The function in use...
SELECT
ts.*
FROM
#tvseason ts
CROSS APPLY dbo.tfn_SplitForSort (ts.title) sfs
ORDER BY
sfs.so_01,
sfs.so_02,
sfs.so_03,
sfs.so_04,
sfs.so_05,
sfs.so_06,
sfs.so_07,
sfs.so_08,
sfs.so_09,
sfs.so_10;
Results:
id title
----------- ------------------------------------------
2 100 Season 1
4 100 Season 2
1 100 Season 03
5 100 Season 4
3 100 Season 10
11 Another 1st Anniversary Season 01
12 Another 2nd Anniversary Season 01
13 Another 10th Anniversary Season 01
9 Another Season 01
10 Another Season 02
16 Show 2 Season 1
6 Show Season 1 (2008)
7 Show Season 2 (2008)
8 Show Season 10 (2008)
14 Some Show Another No Season Number
15 Some Show No Season Number
17 Some Show With Season Number 1
18 Some Show With Season Number 2
19 Some Show With Season Number 10
add a comment |
up vote
2
down vote
Personally, I would try to avoid doing complex string manipuluation in SQL. I would probably dump it out to a text file and process it using a regular expression in something like C# or Python. Then write it back to the DB in a separate column. SQL is notoriously bad at string manipulation.
However here's my stab at a SQL approach. The idea is basically to first eliminate any rows which don't have the string Season [number]
in them. That handles the case where there are no seasons to parse. I chose to include them with nulls, but you could just as easily omit them in your where clause, or give them some default value. I use the stuff()
function to cut off everything up to the string Season [number]
, so it's easier to work with.
Now we have the string starting with the season number, and potentially ending in some garbage. I use a case statement to see if there is garbage (anything non-numeric) and if there is, i take the leftmost numeric characters and throw away the rest. If there is only numeric to begin with, I just leave it as it is.
Finally, cast it as an int, and sort by it.
if object_id('tempdb.dbo.#titles') is not null drop table #titles
create table #titles (Title varchar(100))
insert into #titles (TItle)
select title = '100 Season 1'
union all select '100 Season 2'
union all select '100 Season 03'
union all select '100 Season 4'
union all select '100 Season 10'
union all select 'Another 10th Anniversary Season 01'
union all select 'Another 1st Anniversary Season 01'
union all select 'Another 2nd Anniversary Season 01'
union all select 'Another Season 01'
union all select 'Another Season 02'
union all select 'Show (2008) Season 1'
union all select 'Show (2008) Season 2'
union all select 'Show 2 The 75th Anniversary Season 1'
union all select 'Show Season 1 (2008)'
union all select 'Show Season 2 (2008)'
union all select 'Show Season 10 (2008)'
union all select 'Some Show Another No Season Number'
union all select 'Some Show No Season Number'
union all select 'Some Show With Season Number 1'
union all select 'Some Show With Season Number 2'
union all select 'Some Show With Season Number 10'
;with src as
(
select
Title,
Trimmed = case when Title like '%Season [0-9]%'
then stuff(title, 1, patindex('%season [0-9]%', title) + 6, '')
else null
end
from #titles
)
select
Season = cast(case when Trimmed like '%[^0-9]%' then left(Trimmed, patindex('%[^0-9]%', Trimmed))
else Trimmed
end as int),
Title
from src
order by Season
I like the approach to stripping out the season number early using the $season [0-9]. However, wouldn't you also need to store the rest of string that you stripped out (lets call it season name) and do order by season name, season #? If you sort by just the season # you ignore the text before it
– Kairan
Mar 22 at 1:31
Also, first time I saw this syntax: ";with src as()" can you explain this to me? is this like a temp table of some sort?
– Kairan
Mar 22 at 1:34
1
Sure. That's a Common Table Expression (aka CTE). It works exactly like a derived table in this scenario. The key difference being it has to be declared before the start of the actual statement. So for example, these two statements are identical. The first uses the probably more familiar derived table, and the second uses a CTE.select * from (select num = 1) a ;with someCTE as (select num = 1) select * from someCTE
. I just sometimes prefer using CTEs rather than derived tables for ease of reading; but they do the same thing.
– Xedni
Mar 22 at 18:14
add a comment |
up vote
2
down vote
This question requirement is complex. So it can't be achieved by a simple query. So my solution is below:
First I create a sample data which will be use in this query.
CREATE TABLE #TVSEASON (TITLE VARCHAR(100));
INSERT INTO #TVSEASON (TITLE) VALUES
('100'),
('100 SEASON 03'),
('100 SEASON 1'),
('100 SEASON 10'),
('100 SEASON 2'),
('100 SEASON 4'),
('SHOW (2008) SEASON 1'),
('SHOW (2008) SEASON 2'),
('SHOW SEASON 1 (2008)'),
('SHOW SEASON 2 (2008)'),
('SHOW SEASON 10 (2008)'),
('ANOTHER 1ST ANNIVERSARY SEASON 01'),
('ANOTHER 2ND ANNIVERSARY SEASON 01'),
('ANOTHER 10TH ANNIVERSARY SEASON 01'),
('ANOTHER SEASON 01'),
('ANOTHER SEASON 02'),
('SOME SHOW ANOTHER NO SEASON NUMBER'),
('SOME SHOW NO SEASON NUMBER'),
('SHOW 2 THE 75TH ANNIVERSARY SEASON 1'),
('SOME SHOW WITH SEASON NUMBER 1'),
('SOME SHOW WITH SEASON NUMBER 2'),
('SOME SHOW WITH SEASON NUMBER 10')
For the achieved desired result I create a function for split all words and numbers from the text.
(Note: I also remove st from 1st, nd from 2nd etc through function after trim the spaces between 1 st for safe side if any user mistakely type spaces between 1st, so if you think there is no chance of error then you remove LTRIM from that function, because for removing that values it is also remove th if text has value like "1 the title" which will be convert into 1 e title)
--CREATE SPLIT FUNCTION
CREATE FUNCTION [dbo].[SplitAlphaNumeric]
(
@LIST NVARCHAR(2000)
)
RETURNS @RTNVALUE TABLE
(
ID INT IDENTITY(1,1),
WORDS NVARCHAR(100),
NUMBERS INT
)
AS
BEGIN
WHILE (PATINDEX('%[0-9]%',@LIST) > 0)
BEGIN
INSERT INTO @RTNVALUE (WORDS, NUMBERS)
SELECT CASE WHEN PATINDEX('%[0-9]%',@LIST) = 0 THEN @LIST
WHEN (PATINDEX('%[0-9]%',@LIST) = 1 AND PATINDEX('%[^0-9]%',@LIST) = 0) THEN ''
WHEN PATINDEX('%[0-9]%',@LIST) = 1 THEN ''
ELSE SUBSTRING(@LIST, 1, PATINDEX('%[0-9]%',@LIST) - 1)
END,
CASE WHEN PATINDEX('%[0-9]%',@LIST) = 0 THEN NULL
WHEN (PATINDEX('%[0-9]%',@LIST) = 1 AND PATINDEX('%[^0-9]%',@LIST) = 0) THEN CAST(LTRIM(RTRIM(@LIST)) AS INT)
WHEN PATINDEX('%[0-9]%',@LIST) = 1 THEN SUBSTRING(@LIST, 1, PATINDEX('%[^0-9]%',@LIST) - 1)
ELSE NULL
END
SET @LIST = LTRIM(RTRIM(CASE WHEN PATINDEX('%[0-9]%',@LIST) = 0 OR (PATINDEX('%[0-9]%',@LIST) = 1 AND PATINDEX('%[^0-9]%',@LIST) = 0) THEN ''
WHEN PATINDEX('%[0-9]%',@LIST) = 1 THEN
CASE WHEN LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))) LIKE 'ST%' THEN SUBSTRING(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))),3, LEN(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST))))))
WHEN LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))) LIKE 'ND%' THEN SUBSTRING(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))),3, LEN(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST))))))
WHEN LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))) LIKE 'RD%' THEN SUBSTRING(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))),3, LEN(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST))))))
WHEN LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))) LIKE 'TH%' THEN SUBSTRING(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))),3, LEN(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST))))))
ELSE LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST))))
END
ELSE SUBSTRING(@LIST, PATINDEX('%[0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[0-9]%',REVERSE(@LIST)))
END))
END
INSERT INTO @RTNVALUE (WORDS)
SELECT VALUE = LTRIM(RTRIM(@LIST))
RETURN
END
In third step I use cross apply on calling function because function return table against given string value. On select query I insert all columns into temp table for sort values as per requirement in next step.
SELECT T.TITLE, A.ID, A.NUMBERS, A.WORDS INTO #FINAL
FROM #TVSEASON T
CROSS APPLY dbo.SplitAlphaNumeric(TITLE) A
From the temp table #Final I use stuff for concate all words to make title again without any number occurence in the text, and then use that values to order the title.
You can change that query for order in any sequence like if you want
to order against the text then you order first textval column then
numbers, but if you want to order against summation of all the numbers
which are used in title then order numbers first after sum like I do
or else if you want to order on simple number without sum then don't
use group by clause and subquery and directly order against numbers.
In short you can achieved all the sequences respected to alpha numeric
values after modify that below query and the upper one are the base
query for all the goals.
SELECT A.TITLE--, A.NUMBERS, A.TEXTVAL
FROM (
SELECT A.TITLE,
STUFF((
SELECT ' ' + B.WORDS
FROM #FINAL B
WHERE B.TITLE = A.TITLE
FOR XML PATH(''),TYPE).VALUE('(./TEXT())[1]','VARCHAR(MAX)')
,1,1,'') TEXTVAL,
SUM(ISNULL(A.NUMBERS,0)) NUMBERS
FROM #FINAL A
GROUP BY A.TITLE
) A
ORDER BY A.TEXTVAL, A.NUMBERS
DROP TABLE #FINAL
DROP TABLE #TVSEASON
In last I drops both temp table from memory. I think it is the query for sorting values which you want because if anyone have different order requirement agains alphanumeric values they can achieved their requirement after litle bit modify that query.
add a comment |
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
3
down vote
I think this will do the trick... I simply recognizes changes from non-numeric to numeric.
I haven't done any large scale testing but It should be reasonably fast.
SET QUOTED_IDENTIFIER ON;
GO
SET ANSI_NULLS ON;
GO
ALTER FUNCTION dbo.tfn_SplitForSort
/* ===================================================================
11/11/2018 JL, Created: Comments
=================================================================== */
--===== Define I/O parameters
(
@string VARCHAR(8000)
)
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
WITH
cte_n1 (n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)),
cte_n2 (n) AS (SELECT 1 FROM cte_n1 a CROSS JOIN cte_n1 b),
cte_Tally (n) AS (
SELECT TOP (LEN(@string))
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM
cte_n2 a CROSS JOIN cte_n2 b
),
cte_split_string AS (
SELECT
col_num = ROW_NUMBER() OVER (ORDER BY t.n) + CASE WHEN LEFT(@string, 1) LIKE '[0-9]' THEN 0 ELSE 1 END,
string_part = SUBSTRING(@string, t.n, LEAD(t.n, 1, 8000) OVER (ORDER BY t.n) - t.n)
FROM
cte_Tally t
CROSS APPLY ( VALUES (SUBSTRING(@string, t.n, 2)) ) s (str2)
WHERE
t.n = 1
OR SUBSTRING(@string, t.n - 1, 2) LIKE '[0-9][^0-9]'
OR SUBSTRING(@string, t.n - 1, 2) LIKE '[^0-9][0-9]'
)
SELECT
so_01 = ISNULL(MAX(CASE WHEN ss.col_num = 1 THEN CONVERT(FLOAT, ss.string_part) END), 99999999),
so_02 = MAX(CASE WHEN ss.col_num = 2 THEN ss.string_part END),
so_03 = MAX(CASE WHEN ss.col_num = 3 THEN CONVERT(FLOAT, ss.string_part) END),
so_04 = MAX(CASE WHEN ss.col_num = 4 THEN ss.string_part END),
so_05 = MAX(CASE WHEN ss.col_num = 5 THEN CONVERT(FLOAT, ss.string_part) END),
so_06 = MAX(CASE WHEN ss.col_num = 6 THEN ss.string_part END),
so_07 = MAX(CASE WHEN ss.col_num = 7 THEN CONVERT(FLOAT, ss.string_part) END),
so_08 = MAX(CASE WHEN ss.col_num = 8 THEN ss.string_part END),
so_09 = MAX(CASE WHEN ss.col_num = 9 THEN CONVERT(FLOAT, ss.string_part) END),
so_10 = MAX(CASE WHEN ss.col_num = 10 THEN ss.string_part END)
FROM
cte_split_string ss;
GO
The function in use...
SELECT
ts.*
FROM
#tvseason ts
CROSS APPLY dbo.tfn_SplitForSort (ts.title) sfs
ORDER BY
sfs.so_01,
sfs.so_02,
sfs.so_03,
sfs.so_04,
sfs.so_05,
sfs.so_06,
sfs.so_07,
sfs.so_08,
sfs.so_09,
sfs.so_10;
Results:
id title
----------- ------------------------------------------
2 100 Season 1
4 100 Season 2
1 100 Season 03
5 100 Season 4
3 100 Season 10
11 Another 1st Anniversary Season 01
12 Another 2nd Anniversary Season 01
13 Another 10th Anniversary Season 01
9 Another Season 01
10 Another Season 02
16 Show 2 Season 1
6 Show Season 1 (2008)
7 Show Season 2 (2008)
8 Show Season 10 (2008)
14 Some Show Another No Season Number
15 Some Show No Season Number
17 Some Show With Season Number 1
18 Some Show With Season Number 2
19 Some Show With Season Number 10
add a comment |
up vote
3
down vote
I think this will do the trick... I simply recognizes changes from non-numeric to numeric.
I haven't done any large scale testing but It should be reasonably fast.
SET QUOTED_IDENTIFIER ON;
GO
SET ANSI_NULLS ON;
GO
ALTER FUNCTION dbo.tfn_SplitForSort
/* ===================================================================
11/11/2018 JL, Created: Comments
=================================================================== */
--===== Define I/O parameters
(
@string VARCHAR(8000)
)
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
WITH
cte_n1 (n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)),
cte_n2 (n) AS (SELECT 1 FROM cte_n1 a CROSS JOIN cte_n1 b),
cte_Tally (n) AS (
SELECT TOP (LEN(@string))
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM
cte_n2 a CROSS JOIN cte_n2 b
),
cte_split_string AS (
SELECT
col_num = ROW_NUMBER() OVER (ORDER BY t.n) + CASE WHEN LEFT(@string, 1) LIKE '[0-9]' THEN 0 ELSE 1 END,
string_part = SUBSTRING(@string, t.n, LEAD(t.n, 1, 8000) OVER (ORDER BY t.n) - t.n)
FROM
cte_Tally t
CROSS APPLY ( VALUES (SUBSTRING(@string, t.n, 2)) ) s (str2)
WHERE
t.n = 1
OR SUBSTRING(@string, t.n - 1, 2) LIKE '[0-9][^0-9]'
OR SUBSTRING(@string, t.n - 1, 2) LIKE '[^0-9][0-9]'
)
SELECT
so_01 = ISNULL(MAX(CASE WHEN ss.col_num = 1 THEN CONVERT(FLOAT, ss.string_part) END), 99999999),
so_02 = MAX(CASE WHEN ss.col_num = 2 THEN ss.string_part END),
so_03 = MAX(CASE WHEN ss.col_num = 3 THEN CONVERT(FLOAT, ss.string_part) END),
so_04 = MAX(CASE WHEN ss.col_num = 4 THEN ss.string_part END),
so_05 = MAX(CASE WHEN ss.col_num = 5 THEN CONVERT(FLOAT, ss.string_part) END),
so_06 = MAX(CASE WHEN ss.col_num = 6 THEN ss.string_part END),
so_07 = MAX(CASE WHEN ss.col_num = 7 THEN CONVERT(FLOAT, ss.string_part) END),
so_08 = MAX(CASE WHEN ss.col_num = 8 THEN ss.string_part END),
so_09 = MAX(CASE WHEN ss.col_num = 9 THEN CONVERT(FLOAT, ss.string_part) END),
so_10 = MAX(CASE WHEN ss.col_num = 10 THEN ss.string_part END)
FROM
cte_split_string ss;
GO
The function in use...
SELECT
ts.*
FROM
#tvseason ts
CROSS APPLY dbo.tfn_SplitForSort (ts.title) sfs
ORDER BY
sfs.so_01,
sfs.so_02,
sfs.so_03,
sfs.so_04,
sfs.so_05,
sfs.so_06,
sfs.so_07,
sfs.so_08,
sfs.so_09,
sfs.so_10;
Results:
id title
----------- ------------------------------------------
2 100 Season 1
4 100 Season 2
1 100 Season 03
5 100 Season 4
3 100 Season 10
11 Another 1st Anniversary Season 01
12 Another 2nd Anniversary Season 01
13 Another 10th Anniversary Season 01
9 Another Season 01
10 Another Season 02
16 Show 2 Season 1
6 Show Season 1 (2008)
7 Show Season 2 (2008)
8 Show Season 10 (2008)
14 Some Show Another No Season Number
15 Some Show No Season Number
17 Some Show With Season Number 1
18 Some Show With Season Number 2
19 Some Show With Season Number 10
add a comment |
up vote
3
down vote
up vote
3
down vote
I think this will do the trick... I simply recognizes changes from non-numeric to numeric.
I haven't done any large scale testing but It should be reasonably fast.
SET QUOTED_IDENTIFIER ON;
GO
SET ANSI_NULLS ON;
GO
ALTER FUNCTION dbo.tfn_SplitForSort
/* ===================================================================
11/11/2018 JL, Created: Comments
=================================================================== */
--===== Define I/O parameters
(
@string VARCHAR(8000)
)
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
WITH
cte_n1 (n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)),
cte_n2 (n) AS (SELECT 1 FROM cte_n1 a CROSS JOIN cte_n1 b),
cte_Tally (n) AS (
SELECT TOP (LEN(@string))
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM
cte_n2 a CROSS JOIN cte_n2 b
),
cte_split_string AS (
SELECT
col_num = ROW_NUMBER() OVER (ORDER BY t.n) + CASE WHEN LEFT(@string, 1) LIKE '[0-9]' THEN 0 ELSE 1 END,
string_part = SUBSTRING(@string, t.n, LEAD(t.n, 1, 8000) OVER (ORDER BY t.n) - t.n)
FROM
cte_Tally t
CROSS APPLY ( VALUES (SUBSTRING(@string, t.n, 2)) ) s (str2)
WHERE
t.n = 1
OR SUBSTRING(@string, t.n - 1, 2) LIKE '[0-9][^0-9]'
OR SUBSTRING(@string, t.n - 1, 2) LIKE '[^0-9][0-9]'
)
SELECT
so_01 = ISNULL(MAX(CASE WHEN ss.col_num = 1 THEN CONVERT(FLOAT, ss.string_part) END), 99999999),
so_02 = MAX(CASE WHEN ss.col_num = 2 THEN ss.string_part END),
so_03 = MAX(CASE WHEN ss.col_num = 3 THEN CONVERT(FLOAT, ss.string_part) END),
so_04 = MAX(CASE WHEN ss.col_num = 4 THEN ss.string_part END),
so_05 = MAX(CASE WHEN ss.col_num = 5 THEN CONVERT(FLOAT, ss.string_part) END),
so_06 = MAX(CASE WHEN ss.col_num = 6 THEN ss.string_part END),
so_07 = MAX(CASE WHEN ss.col_num = 7 THEN CONVERT(FLOAT, ss.string_part) END),
so_08 = MAX(CASE WHEN ss.col_num = 8 THEN ss.string_part END),
so_09 = MAX(CASE WHEN ss.col_num = 9 THEN CONVERT(FLOAT, ss.string_part) END),
so_10 = MAX(CASE WHEN ss.col_num = 10 THEN ss.string_part END)
FROM
cte_split_string ss;
GO
The function in use...
SELECT
ts.*
FROM
#tvseason ts
CROSS APPLY dbo.tfn_SplitForSort (ts.title) sfs
ORDER BY
sfs.so_01,
sfs.so_02,
sfs.so_03,
sfs.so_04,
sfs.so_05,
sfs.so_06,
sfs.so_07,
sfs.so_08,
sfs.so_09,
sfs.so_10;
Results:
id title
----------- ------------------------------------------
2 100 Season 1
4 100 Season 2
1 100 Season 03
5 100 Season 4
3 100 Season 10
11 Another 1st Anniversary Season 01
12 Another 2nd Anniversary Season 01
13 Another 10th Anniversary Season 01
9 Another Season 01
10 Another Season 02
16 Show 2 Season 1
6 Show Season 1 (2008)
7 Show Season 2 (2008)
8 Show Season 10 (2008)
14 Some Show Another No Season Number
15 Some Show No Season Number
17 Some Show With Season Number 1
18 Some Show With Season Number 2
19 Some Show With Season Number 10
I think this will do the trick... I simply recognizes changes from non-numeric to numeric.
I haven't done any large scale testing but It should be reasonably fast.
SET QUOTED_IDENTIFIER ON;
GO
SET ANSI_NULLS ON;
GO
ALTER FUNCTION dbo.tfn_SplitForSort
/* ===================================================================
11/11/2018 JL, Created: Comments
=================================================================== */
--===== Define I/O parameters
(
@string VARCHAR(8000)
)
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
WITH
cte_n1 (n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)),
cte_n2 (n) AS (SELECT 1 FROM cte_n1 a CROSS JOIN cte_n1 b),
cte_Tally (n) AS (
SELECT TOP (LEN(@string))
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM
cte_n2 a CROSS JOIN cte_n2 b
),
cte_split_string AS (
SELECT
col_num = ROW_NUMBER() OVER (ORDER BY t.n) + CASE WHEN LEFT(@string, 1) LIKE '[0-9]' THEN 0 ELSE 1 END,
string_part = SUBSTRING(@string, t.n, LEAD(t.n, 1, 8000) OVER (ORDER BY t.n) - t.n)
FROM
cte_Tally t
CROSS APPLY ( VALUES (SUBSTRING(@string, t.n, 2)) ) s (str2)
WHERE
t.n = 1
OR SUBSTRING(@string, t.n - 1, 2) LIKE '[0-9][^0-9]'
OR SUBSTRING(@string, t.n - 1, 2) LIKE '[^0-9][0-9]'
)
SELECT
so_01 = ISNULL(MAX(CASE WHEN ss.col_num = 1 THEN CONVERT(FLOAT, ss.string_part) END), 99999999),
so_02 = MAX(CASE WHEN ss.col_num = 2 THEN ss.string_part END),
so_03 = MAX(CASE WHEN ss.col_num = 3 THEN CONVERT(FLOAT, ss.string_part) END),
so_04 = MAX(CASE WHEN ss.col_num = 4 THEN ss.string_part END),
so_05 = MAX(CASE WHEN ss.col_num = 5 THEN CONVERT(FLOAT, ss.string_part) END),
so_06 = MAX(CASE WHEN ss.col_num = 6 THEN ss.string_part END),
so_07 = MAX(CASE WHEN ss.col_num = 7 THEN CONVERT(FLOAT, ss.string_part) END),
so_08 = MAX(CASE WHEN ss.col_num = 8 THEN ss.string_part END),
so_09 = MAX(CASE WHEN ss.col_num = 9 THEN CONVERT(FLOAT, ss.string_part) END),
so_10 = MAX(CASE WHEN ss.col_num = 10 THEN ss.string_part END)
FROM
cte_split_string ss;
GO
The function in use...
SELECT
ts.*
FROM
#tvseason ts
CROSS APPLY dbo.tfn_SplitForSort (ts.title) sfs
ORDER BY
sfs.so_01,
sfs.so_02,
sfs.so_03,
sfs.so_04,
sfs.so_05,
sfs.so_06,
sfs.so_07,
sfs.so_08,
sfs.so_09,
sfs.so_10;
Results:
id title
----------- ------------------------------------------
2 100 Season 1
4 100 Season 2
1 100 Season 03
5 100 Season 4
3 100 Season 10
11 Another 1st Anniversary Season 01
12 Another 2nd Anniversary Season 01
13 Another 10th Anniversary Season 01
9 Another Season 01
10 Another Season 02
16 Show 2 Season 1
6 Show Season 1 (2008)
7 Show Season 2 (2008)
8 Show Season 10 (2008)
14 Some Show Another No Season Number
15 Some Show No Season Number
17 Some Show With Season Number 1
18 Some Show With Season Number 2
19 Some Show With Season Number 10
answered Nov 11 at 20:31
Jason A. Long
3,5951412
3,5951412
add a comment |
add a comment |
up vote
2
down vote
Personally, I would try to avoid doing complex string manipuluation in SQL. I would probably dump it out to a text file and process it using a regular expression in something like C# or Python. Then write it back to the DB in a separate column. SQL is notoriously bad at string manipulation.
However here's my stab at a SQL approach. The idea is basically to first eliminate any rows which don't have the string Season [number]
in them. That handles the case where there are no seasons to parse. I chose to include them with nulls, but you could just as easily omit them in your where clause, or give them some default value. I use the stuff()
function to cut off everything up to the string Season [number]
, so it's easier to work with.
Now we have the string starting with the season number, and potentially ending in some garbage. I use a case statement to see if there is garbage (anything non-numeric) and if there is, i take the leftmost numeric characters and throw away the rest. If there is only numeric to begin with, I just leave it as it is.
Finally, cast it as an int, and sort by it.
if object_id('tempdb.dbo.#titles') is not null drop table #titles
create table #titles (Title varchar(100))
insert into #titles (TItle)
select title = '100 Season 1'
union all select '100 Season 2'
union all select '100 Season 03'
union all select '100 Season 4'
union all select '100 Season 10'
union all select 'Another 10th Anniversary Season 01'
union all select 'Another 1st Anniversary Season 01'
union all select 'Another 2nd Anniversary Season 01'
union all select 'Another Season 01'
union all select 'Another Season 02'
union all select 'Show (2008) Season 1'
union all select 'Show (2008) Season 2'
union all select 'Show 2 The 75th Anniversary Season 1'
union all select 'Show Season 1 (2008)'
union all select 'Show Season 2 (2008)'
union all select 'Show Season 10 (2008)'
union all select 'Some Show Another No Season Number'
union all select 'Some Show No Season Number'
union all select 'Some Show With Season Number 1'
union all select 'Some Show With Season Number 2'
union all select 'Some Show With Season Number 10'
;with src as
(
select
Title,
Trimmed = case when Title like '%Season [0-9]%'
then stuff(title, 1, patindex('%season [0-9]%', title) + 6, '')
else null
end
from #titles
)
select
Season = cast(case when Trimmed like '%[^0-9]%' then left(Trimmed, patindex('%[^0-9]%', Trimmed))
else Trimmed
end as int),
Title
from src
order by Season
I like the approach to stripping out the season number early using the $season [0-9]. However, wouldn't you also need to store the rest of string that you stripped out (lets call it season name) and do order by season name, season #? If you sort by just the season # you ignore the text before it
– Kairan
Mar 22 at 1:31
Also, first time I saw this syntax: ";with src as()" can you explain this to me? is this like a temp table of some sort?
– Kairan
Mar 22 at 1:34
1
Sure. That's a Common Table Expression (aka CTE). It works exactly like a derived table in this scenario. The key difference being it has to be declared before the start of the actual statement. So for example, these two statements are identical. The first uses the probably more familiar derived table, and the second uses a CTE.select * from (select num = 1) a ;with someCTE as (select num = 1) select * from someCTE
. I just sometimes prefer using CTEs rather than derived tables for ease of reading; but they do the same thing.
– Xedni
Mar 22 at 18:14
add a comment |
up vote
2
down vote
Personally, I would try to avoid doing complex string manipuluation in SQL. I would probably dump it out to a text file and process it using a regular expression in something like C# or Python. Then write it back to the DB in a separate column. SQL is notoriously bad at string manipulation.
However here's my stab at a SQL approach. The idea is basically to first eliminate any rows which don't have the string Season [number]
in them. That handles the case where there are no seasons to parse. I chose to include them with nulls, but you could just as easily omit them in your where clause, or give them some default value. I use the stuff()
function to cut off everything up to the string Season [number]
, so it's easier to work with.
Now we have the string starting with the season number, and potentially ending in some garbage. I use a case statement to see if there is garbage (anything non-numeric) and if there is, i take the leftmost numeric characters and throw away the rest. If there is only numeric to begin with, I just leave it as it is.
Finally, cast it as an int, and sort by it.
if object_id('tempdb.dbo.#titles') is not null drop table #titles
create table #titles (Title varchar(100))
insert into #titles (TItle)
select title = '100 Season 1'
union all select '100 Season 2'
union all select '100 Season 03'
union all select '100 Season 4'
union all select '100 Season 10'
union all select 'Another 10th Anniversary Season 01'
union all select 'Another 1st Anniversary Season 01'
union all select 'Another 2nd Anniversary Season 01'
union all select 'Another Season 01'
union all select 'Another Season 02'
union all select 'Show (2008) Season 1'
union all select 'Show (2008) Season 2'
union all select 'Show 2 The 75th Anniversary Season 1'
union all select 'Show Season 1 (2008)'
union all select 'Show Season 2 (2008)'
union all select 'Show Season 10 (2008)'
union all select 'Some Show Another No Season Number'
union all select 'Some Show No Season Number'
union all select 'Some Show With Season Number 1'
union all select 'Some Show With Season Number 2'
union all select 'Some Show With Season Number 10'
;with src as
(
select
Title,
Trimmed = case when Title like '%Season [0-9]%'
then stuff(title, 1, patindex('%season [0-9]%', title) + 6, '')
else null
end
from #titles
)
select
Season = cast(case when Trimmed like '%[^0-9]%' then left(Trimmed, patindex('%[^0-9]%', Trimmed))
else Trimmed
end as int),
Title
from src
order by Season
I like the approach to stripping out the season number early using the $season [0-9]. However, wouldn't you also need to store the rest of string that you stripped out (lets call it season name) and do order by season name, season #? If you sort by just the season # you ignore the text before it
– Kairan
Mar 22 at 1:31
Also, first time I saw this syntax: ";with src as()" can you explain this to me? is this like a temp table of some sort?
– Kairan
Mar 22 at 1:34
1
Sure. That's a Common Table Expression (aka CTE). It works exactly like a derived table in this scenario. The key difference being it has to be declared before the start of the actual statement. So for example, these two statements are identical. The first uses the probably more familiar derived table, and the second uses a CTE.select * from (select num = 1) a ;with someCTE as (select num = 1) select * from someCTE
. I just sometimes prefer using CTEs rather than derived tables for ease of reading; but they do the same thing.
– Xedni
Mar 22 at 18:14
add a comment |
up vote
2
down vote
up vote
2
down vote
Personally, I would try to avoid doing complex string manipuluation in SQL. I would probably dump it out to a text file and process it using a regular expression in something like C# or Python. Then write it back to the DB in a separate column. SQL is notoriously bad at string manipulation.
However here's my stab at a SQL approach. The idea is basically to first eliminate any rows which don't have the string Season [number]
in them. That handles the case where there are no seasons to parse. I chose to include them with nulls, but you could just as easily omit them in your where clause, or give them some default value. I use the stuff()
function to cut off everything up to the string Season [number]
, so it's easier to work with.
Now we have the string starting with the season number, and potentially ending in some garbage. I use a case statement to see if there is garbage (anything non-numeric) and if there is, i take the leftmost numeric characters and throw away the rest. If there is only numeric to begin with, I just leave it as it is.
Finally, cast it as an int, and sort by it.
if object_id('tempdb.dbo.#titles') is not null drop table #titles
create table #titles (Title varchar(100))
insert into #titles (TItle)
select title = '100 Season 1'
union all select '100 Season 2'
union all select '100 Season 03'
union all select '100 Season 4'
union all select '100 Season 10'
union all select 'Another 10th Anniversary Season 01'
union all select 'Another 1st Anniversary Season 01'
union all select 'Another 2nd Anniversary Season 01'
union all select 'Another Season 01'
union all select 'Another Season 02'
union all select 'Show (2008) Season 1'
union all select 'Show (2008) Season 2'
union all select 'Show 2 The 75th Anniversary Season 1'
union all select 'Show Season 1 (2008)'
union all select 'Show Season 2 (2008)'
union all select 'Show Season 10 (2008)'
union all select 'Some Show Another No Season Number'
union all select 'Some Show No Season Number'
union all select 'Some Show With Season Number 1'
union all select 'Some Show With Season Number 2'
union all select 'Some Show With Season Number 10'
;with src as
(
select
Title,
Trimmed = case when Title like '%Season [0-9]%'
then stuff(title, 1, patindex('%season [0-9]%', title) + 6, '')
else null
end
from #titles
)
select
Season = cast(case when Trimmed like '%[^0-9]%' then left(Trimmed, patindex('%[^0-9]%', Trimmed))
else Trimmed
end as int),
Title
from src
order by Season
Personally, I would try to avoid doing complex string manipuluation in SQL. I would probably dump it out to a text file and process it using a regular expression in something like C# or Python. Then write it back to the DB in a separate column. SQL is notoriously bad at string manipulation.
However here's my stab at a SQL approach. The idea is basically to first eliminate any rows which don't have the string Season [number]
in them. That handles the case where there are no seasons to parse. I chose to include them with nulls, but you could just as easily omit them in your where clause, or give them some default value. I use the stuff()
function to cut off everything up to the string Season [number]
, so it's easier to work with.
Now we have the string starting with the season number, and potentially ending in some garbage. I use a case statement to see if there is garbage (anything non-numeric) and if there is, i take the leftmost numeric characters and throw away the rest. If there is only numeric to begin with, I just leave it as it is.
Finally, cast it as an int, and sort by it.
if object_id('tempdb.dbo.#titles') is not null drop table #titles
create table #titles (Title varchar(100))
insert into #titles (TItle)
select title = '100 Season 1'
union all select '100 Season 2'
union all select '100 Season 03'
union all select '100 Season 4'
union all select '100 Season 10'
union all select 'Another 10th Anniversary Season 01'
union all select 'Another 1st Anniversary Season 01'
union all select 'Another 2nd Anniversary Season 01'
union all select 'Another Season 01'
union all select 'Another Season 02'
union all select 'Show (2008) Season 1'
union all select 'Show (2008) Season 2'
union all select 'Show 2 The 75th Anniversary Season 1'
union all select 'Show Season 1 (2008)'
union all select 'Show Season 2 (2008)'
union all select 'Show Season 10 (2008)'
union all select 'Some Show Another No Season Number'
union all select 'Some Show No Season Number'
union all select 'Some Show With Season Number 1'
union all select 'Some Show With Season Number 2'
union all select 'Some Show With Season Number 10'
;with src as
(
select
Title,
Trimmed = case when Title like '%Season [0-9]%'
then stuff(title, 1, patindex('%season [0-9]%', title) + 6, '')
else null
end
from #titles
)
select
Season = cast(case when Trimmed like '%[^0-9]%' then left(Trimmed, patindex('%[^0-9]%', Trimmed))
else Trimmed
end as int),
Title
from src
order by Season
edited Mar 20 at 1:27
answered Mar 20 at 1:10
Xedni
2,9121819
2,9121819
I like the approach to stripping out the season number early using the $season [0-9]. However, wouldn't you also need to store the rest of string that you stripped out (lets call it season name) and do order by season name, season #? If you sort by just the season # you ignore the text before it
– Kairan
Mar 22 at 1:31
Also, first time I saw this syntax: ";with src as()" can you explain this to me? is this like a temp table of some sort?
– Kairan
Mar 22 at 1:34
1
Sure. That's a Common Table Expression (aka CTE). It works exactly like a derived table in this scenario. The key difference being it has to be declared before the start of the actual statement. So for example, these two statements are identical. The first uses the probably more familiar derived table, and the second uses a CTE.select * from (select num = 1) a ;with someCTE as (select num = 1) select * from someCTE
. I just sometimes prefer using CTEs rather than derived tables for ease of reading; but they do the same thing.
– Xedni
Mar 22 at 18:14
add a comment |
I like the approach to stripping out the season number early using the $season [0-9]. However, wouldn't you also need to store the rest of string that you stripped out (lets call it season name) and do order by season name, season #? If you sort by just the season # you ignore the text before it
– Kairan
Mar 22 at 1:31
Also, first time I saw this syntax: ";with src as()" can you explain this to me? is this like a temp table of some sort?
– Kairan
Mar 22 at 1:34
1
Sure. That's a Common Table Expression (aka CTE). It works exactly like a derived table in this scenario. The key difference being it has to be declared before the start of the actual statement. So for example, these two statements are identical. The first uses the probably more familiar derived table, and the second uses a CTE.select * from (select num = 1) a ;with someCTE as (select num = 1) select * from someCTE
. I just sometimes prefer using CTEs rather than derived tables for ease of reading; but they do the same thing.
– Xedni
Mar 22 at 18:14
I like the approach to stripping out the season number early using the $season [0-9]. However, wouldn't you also need to store the rest of string that you stripped out (lets call it season name) and do order by season name, season #? If you sort by just the season # you ignore the text before it
– Kairan
Mar 22 at 1:31
I like the approach to stripping out the season number early using the $season [0-9]. However, wouldn't you also need to store the rest of string that you stripped out (lets call it season name) and do order by season name, season #? If you sort by just the season # you ignore the text before it
– Kairan
Mar 22 at 1:31
Also, first time I saw this syntax: ";with src as()" can you explain this to me? is this like a temp table of some sort?
– Kairan
Mar 22 at 1:34
Also, first time I saw this syntax: ";with src as()" can you explain this to me? is this like a temp table of some sort?
– Kairan
Mar 22 at 1:34
1
1
Sure. That's a Common Table Expression (aka CTE). It works exactly like a derived table in this scenario. The key difference being it has to be declared before the start of the actual statement. So for example, these two statements are identical. The first uses the probably more familiar derived table, and the second uses a CTE.
select * from (select num = 1) a ;with someCTE as (select num = 1) select * from someCTE
. I just sometimes prefer using CTEs rather than derived tables for ease of reading; but they do the same thing.– Xedni
Mar 22 at 18:14
Sure. That's a Common Table Expression (aka CTE). It works exactly like a derived table in this scenario. The key difference being it has to be declared before the start of the actual statement. So for example, these two statements are identical. The first uses the probably more familiar derived table, and the second uses a CTE.
select * from (select num = 1) a ;with someCTE as (select num = 1) select * from someCTE
. I just sometimes prefer using CTEs rather than derived tables for ease of reading; but they do the same thing.– Xedni
Mar 22 at 18:14
add a comment |
up vote
2
down vote
This question requirement is complex. So it can't be achieved by a simple query. So my solution is below:
First I create a sample data which will be use in this query.
CREATE TABLE #TVSEASON (TITLE VARCHAR(100));
INSERT INTO #TVSEASON (TITLE) VALUES
('100'),
('100 SEASON 03'),
('100 SEASON 1'),
('100 SEASON 10'),
('100 SEASON 2'),
('100 SEASON 4'),
('SHOW (2008) SEASON 1'),
('SHOW (2008) SEASON 2'),
('SHOW SEASON 1 (2008)'),
('SHOW SEASON 2 (2008)'),
('SHOW SEASON 10 (2008)'),
('ANOTHER 1ST ANNIVERSARY SEASON 01'),
('ANOTHER 2ND ANNIVERSARY SEASON 01'),
('ANOTHER 10TH ANNIVERSARY SEASON 01'),
('ANOTHER SEASON 01'),
('ANOTHER SEASON 02'),
('SOME SHOW ANOTHER NO SEASON NUMBER'),
('SOME SHOW NO SEASON NUMBER'),
('SHOW 2 THE 75TH ANNIVERSARY SEASON 1'),
('SOME SHOW WITH SEASON NUMBER 1'),
('SOME SHOW WITH SEASON NUMBER 2'),
('SOME SHOW WITH SEASON NUMBER 10')
For the achieved desired result I create a function for split all words and numbers from the text.
(Note: I also remove st from 1st, nd from 2nd etc through function after trim the spaces between 1 st for safe side if any user mistakely type spaces between 1st, so if you think there is no chance of error then you remove LTRIM from that function, because for removing that values it is also remove th if text has value like "1 the title" which will be convert into 1 e title)
--CREATE SPLIT FUNCTION
CREATE FUNCTION [dbo].[SplitAlphaNumeric]
(
@LIST NVARCHAR(2000)
)
RETURNS @RTNVALUE TABLE
(
ID INT IDENTITY(1,1),
WORDS NVARCHAR(100),
NUMBERS INT
)
AS
BEGIN
WHILE (PATINDEX('%[0-9]%',@LIST) > 0)
BEGIN
INSERT INTO @RTNVALUE (WORDS, NUMBERS)
SELECT CASE WHEN PATINDEX('%[0-9]%',@LIST) = 0 THEN @LIST
WHEN (PATINDEX('%[0-9]%',@LIST) = 1 AND PATINDEX('%[^0-9]%',@LIST) = 0) THEN ''
WHEN PATINDEX('%[0-9]%',@LIST) = 1 THEN ''
ELSE SUBSTRING(@LIST, 1, PATINDEX('%[0-9]%',@LIST) - 1)
END,
CASE WHEN PATINDEX('%[0-9]%',@LIST) = 0 THEN NULL
WHEN (PATINDEX('%[0-9]%',@LIST) = 1 AND PATINDEX('%[^0-9]%',@LIST) = 0) THEN CAST(LTRIM(RTRIM(@LIST)) AS INT)
WHEN PATINDEX('%[0-9]%',@LIST) = 1 THEN SUBSTRING(@LIST, 1, PATINDEX('%[^0-9]%',@LIST) - 1)
ELSE NULL
END
SET @LIST = LTRIM(RTRIM(CASE WHEN PATINDEX('%[0-9]%',@LIST) = 0 OR (PATINDEX('%[0-9]%',@LIST) = 1 AND PATINDEX('%[^0-9]%',@LIST) = 0) THEN ''
WHEN PATINDEX('%[0-9]%',@LIST) = 1 THEN
CASE WHEN LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))) LIKE 'ST%' THEN SUBSTRING(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))),3, LEN(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST))))))
WHEN LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))) LIKE 'ND%' THEN SUBSTRING(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))),3, LEN(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST))))))
WHEN LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))) LIKE 'RD%' THEN SUBSTRING(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))),3, LEN(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST))))))
WHEN LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))) LIKE 'TH%' THEN SUBSTRING(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))),3, LEN(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST))))))
ELSE LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST))))
END
ELSE SUBSTRING(@LIST, PATINDEX('%[0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[0-9]%',REVERSE(@LIST)))
END))
END
INSERT INTO @RTNVALUE (WORDS)
SELECT VALUE = LTRIM(RTRIM(@LIST))
RETURN
END
In third step I use cross apply on calling function because function return table against given string value. On select query I insert all columns into temp table for sort values as per requirement in next step.
SELECT T.TITLE, A.ID, A.NUMBERS, A.WORDS INTO #FINAL
FROM #TVSEASON T
CROSS APPLY dbo.SplitAlphaNumeric(TITLE) A
From the temp table #Final I use stuff for concate all words to make title again without any number occurence in the text, and then use that values to order the title.
You can change that query for order in any sequence like if you want
to order against the text then you order first textval column then
numbers, but if you want to order against summation of all the numbers
which are used in title then order numbers first after sum like I do
or else if you want to order on simple number without sum then don't
use group by clause and subquery and directly order against numbers.
In short you can achieved all the sequences respected to alpha numeric
values after modify that below query and the upper one are the base
query for all the goals.
SELECT A.TITLE--, A.NUMBERS, A.TEXTVAL
FROM (
SELECT A.TITLE,
STUFF((
SELECT ' ' + B.WORDS
FROM #FINAL B
WHERE B.TITLE = A.TITLE
FOR XML PATH(''),TYPE).VALUE('(./TEXT())[1]','VARCHAR(MAX)')
,1,1,'') TEXTVAL,
SUM(ISNULL(A.NUMBERS,0)) NUMBERS
FROM #FINAL A
GROUP BY A.TITLE
) A
ORDER BY A.TEXTVAL, A.NUMBERS
DROP TABLE #FINAL
DROP TABLE #TVSEASON
In last I drops both temp table from memory. I think it is the query for sorting values which you want because if anyone have different order requirement agains alphanumeric values they can achieved their requirement after litle bit modify that query.
add a comment |
up vote
2
down vote
This question requirement is complex. So it can't be achieved by a simple query. So my solution is below:
First I create a sample data which will be use in this query.
CREATE TABLE #TVSEASON (TITLE VARCHAR(100));
INSERT INTO #TVSEASON (TITLE) VALUES
('100'),
('100 SEASON 03'),
('100 SEASON 1'),
('100 SEASON 10'),
('100 SEASON 2'),
('100 SEASON 4'),
('SHOW (2008) SEASON 1'),
('SHOW (2008) SEASON 2'),
('SHOW SEASON 1 (2008)'),
('SHOW SEASON 2 (2008)'),
('SHOW SEASON 10 (2008)'),
('ANOTHER 1ST ANNIVERSARY SEASON 01'),
('ANOTHER 2ND ANNIVERSARY SEASON 01'),
('ANOTHER 10TH ANNIVERSARY SEASON 01'),
('ANOTHER SEASON 01'),
('ANOTHER SEASON 02'),
('SOME SHOW ANOTHER NO SEASON NUMBER'),
('SOME SHOW NO SEASON NUMBER'),
('SHOW 2 THE 75TH ANNIVERSARY SEASON 1'),
('SOME SHOW WITH SEASON NUMBER 1'),
('SOME SHOW WITH SEASON NUMBER 2'),
('SOME SHOW WITH SEASON NUMBER 10')
For the achieved desired result I create a function for split all words and numbers from the text.
(Note: I also remove st from 1st, nd from 2nd etc through function after trim the spaces between 1 st for safe side if any user mistakely type spaces between 1st, so if you think there is no chance of error then you remove LTRIM from that function, because for removing that values it is also remove th if text has value like "1 the title" which will be convert into 1 e title)
--CREATE SPLIT FUNCTION
CREATE FUNCTION [dbo].[SplitAlphaNumeric]
(
@LIST NVARCHAR(2000)
)
RETURNS @RTNVALUE TABLE
(
ID INT IDENTITY(1,1),
WORDS NVARCHAR(100),
NUMBERS INT
)
AS
BEGIN
WHILE (PATINDEX('%[0-9]%',@LIST) > 0)
BEGIN
INSERT INTO @RTNVALUE (WORDS, NUMBERS)
SELECT CASE WHEN PATINDEX('%[0-9]%',@LIST) = 0 THEN @LIST
WHEN (PATINDEX('%[0-9]%',@LIST) = 1 AND PATINDEX('%[^0-9]%',@LIST) = 0) THEN ''
WHEN PATINDEX('%[0-9]%',@LIST) = 1 THEN ''
ELSE SUBSTRING(@LIST, 1, PATINDEX('%[0-9]%',@LIST) - 1)
END,
CASE WHEN PATINDEX('%[0-9]%',@LIST) = 0 THEN NULL
WHEN (PATINDEX('%[0-9]%',@LIST) = 1 AND PATINDEX('%[^0-9]%',@LIST) = 0) THEN CAST(LTRIM(RTRIM(@LIST)) AS INT)
WHEN PATINDEX('%[0-9]%',@LIST) = 1 THEN SUBSTRING(@LIST, 1, PATINDEX('%[^0-9]%',@LIST) - 1)
ELSE NULL
END
SET @LIST = LTRIM(RTRIM(CASE WHEN PATINDEX('%[0-9]%',@LIST) = 0 OR (PATINDEX('%[0-9]%',@LIST) = 1 AND PATINDEX('%[^0-9]%',@LIST) = 0) THEN ''
WHEN PATINDEX('%[0-9]%',@LIST) = 1 THEN
CASE WHEN LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))) LIKE 'ST%' THEN SUBSTRING(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))),3, LEN(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST))))))
WHEN LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))) LIKE 'ND%' THEN SUBSTRING(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))),3, LEN(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST))))))
WHEN LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))) LIKE 'RD%' THEN SUBSTRING(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))),3, LEN(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST))))))
WHEN LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))) LIKE 'TH%' THEN SUBSTRING(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))),3, LEN(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST))))))
ELSE LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST))))
END
ELSE SUBSTRING(@LIST, PATINDEX('%[0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[0-9]%',REVERSE(@LIST)))
END))
END
INSERT INTO @RTNVALUE (WORDS)
SELECT VALUE = LTRIM(RTRIM(@LIST))
RETURN
END
In third step I use cross apply on calling function because function return table against given string value. On select query I insert all columns into temp table for sort values as per requirement in next step.
SELECT T.TITLE, A.ID, A.NUMBERS, A.WORDS INTO #FINAL
FROM #TVSEASON T
CROSS APPLY dbo.SplitAlphaNumeric(TITLE) A
From the temp table #Final I use stuff for concate all words to make title again without any number occurence in the text, and then use that values to order the title.
You can change that query for order in any sequence like if you want
to order against the text then you order first textval column then
numbers, but if you want to order against summation of all the numbers
which are used in title then order numbers first after sum like I do
or else if you want to order on simple number without sum then don't
use group by clause and subquery and directly order against numbers.
In short you can achieved all the sequences respected to alpha numeric
values after modify that below query and the upper one are the base
query for all the goals.
SELECT A.TITLE--, A.NUMBERS, A.TEXTVAL
FROM (
SELECT A.TITLE,
STUFF((
SELECT ' ' + B.WORDS
FROM #FINAL B
WHERE B.TITLE = A.TITLE
FOR XML PATH(''),TYPE).VALUE('(./TEXT())[1]','VARCHAR(MAX)')
,1,1,'') TEXTVAL,
SUM(ISNULL(A.NUMBERS,0)) NUMBERS
FROM #FINAL A
GROUP BY A.TITLE
) A
ORDER BY A.TEXTVAL, A.NUMBERS
DROP TABLE #FINAL
DROP TABLE #TVSEASON
In last I drops both temp table from memory. I think it is the query for sorting values which you want because if anyone have different order requirement agains alphanumeric values they can achieved their requirement after litle bit modify that query.
add a comment |
up vote
2
down vote
up vote
2
down vote
This question requirement is complex. So it can't be achieved by a simple query. So my solution is below:
First I create a sample data which will be use in this query.
CREATE TABLE #TVSEASON (TITLE VARCHAR(100));
INSERT INTO #TVSEASON (TITLE) VALUES
('100'),
('100 SEASON 03'),
('100 SEASON 1'),
('100 SEASON 10'),
('100 SEASON 2'),
('100 SEASON 4'),
('SHOW (2008) SEASON 1'),
('SHOW (2008) SEASON 2'),
('SHOW SEASON 1 (2008)'),
('SHOW SEASON 2 (2008)'),
('SHOW SEASON 10 (2008)'),
('ANOTHER 1ST ANNIVERSARY SEASON 01'),
('ANOTHER 2ND ANNIVERSARY SEASON 01'),
('ANOTHER 10TH ANNIVERSARY SEASON 01'),
('ANOTHER SEASON 01'),
('ANOTHER SEASON 02'),
('SOME SHOW ANOTHER NO SEASON NUMBER'),
('SOME SHOW NO SEASON NUMBER'),
('SHOW 2 THE 75TH ANNIVERSARY SEASON 1'),
('SOME SHOW WITH SEASON NUMBER 1'),
('SOME SHOW WITH SEASON NUMBER 2'),
('SOME SHOW WITH SEASON NUMBER 10')
For the achieved desired result I create a function for split all words and numbers from the text.
(Note: I also remove st from 1st, nd from 2nd etc through function after trim the spaces between 1 st for safe side if any user mistakely type spaces between 1st, so if you think there is no chance of error then you remove LTRIM from that function, because for removing that values it is also remove th if text has value like "1 the title" which will be convert into 1 e title)
--CREATE SPLIT FUNCTION
CREATE FUNCTION [dbo].[SplitAlphaNumeric]
(
@LIST NVARCHAR(2000)
)
RETURNS @RTNVALUE TABLE
(
ID INT IDENTITY(1,1),
WORDS NVARCHAR(100),
NUMBERS INT
)
AS
BEGIN
WHILE (PATINDEX('%[0-9]%',@LIST) > 0)
BEGIN
INSERT INTO @RTNVALUE (WORDS, NUMBERS)
SELECT CASE WHEN PATINDEX('%[0-9]%',@LIST) = 0 THEN @LIST
WHEN (PATINDEX('%[0-9]%',@LIST) = 1 AND PATINDEX('%[^0-9]%',@LIST) = 0) THEN ''
WHEN PATINDEX('%[0-9]%',@LIST) = 1 THEN ''
ELSE SUBSTRING(@LIST, 1, PATINDEX('%[0-9]%',@LIST) - 1)
END,
CASE WHEN PATINDEX('%[0-9]%',@LIST) = 0 THEN NULL
WHEN (PATINDEX('%[0-9]%',@LIST) = 1 AND PATINDEX('%[^0-9]%',@LIST) = 0) THEN CAST(LTRIM(RTRIM(@LIST)) AS INT)
WHEN PATINDEX('%[0-9]%',@LIST) = 1 THEN SUBSTRING(@LIST, 1, PATINDEX('%[^0-9]%',@LIST) - 1)
ELSE NULL
END
SET @LIST = LTRIM(RTRIM(CASE WHEN PATINDEX('%[0-9]%',@LIST) = 0 OR (PATINDEX('%[0-9]%',@LIST) = 1 AND PATINDEX('%[^0-9]%',@LIST) = 0) THEN ''
WHEN PATINDEX('%[0-9]%',@LIST) = 1 THEN
CASE WHEN LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))) LIKE 'ST%' THEN SUBSTRING(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))),3, LEN(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST))))))
WHEN LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))) LIKE 'ND%' THEN SUBSTRING(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))),3, LEN(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST))))))
WHEN LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))) LIKE 'RD%' THEN SUBSTRING(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))),3, LEN(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST))))))
WHEN LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))) LIKE 'TH%' THEN SUBSTRING(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))),3, LEN(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST))))))
ELSE LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST))))
END
ELSE SUBSTRING(@LIST, PATINDEX('%[0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[0-9]%',REVERSE(@LIST)))
END))
END
INSERT INTO @RTNVALUE (WORDS)
SELECT VALUE = LTRIM(RTRIM(@LIST))
RETURN
END
In third step I use cross apply on calling function because function return table against given string value. On select query I insert all columns into temp table for sort values as per requirement in next step.
SELECT T.TITLE, A.ID, A.NUMBERS, A.WORDS INTO #FINAL
FROM #TVSEASON T
CROSS APPLY dbo.SplitAlphaNumeric(TITLE) A
From the temp table #Final I use stuff for concate all words to make title again without any number occurence in the text, and then use that values to order the title.
You can change that query for order in any sequence like if you want
to order against the text then you order first textval column then
numbers, but if you want to order against summation of all the numbers
which are used in title then order numbers first after sum like I do
or else if you want to order on simple number without sum then don't
use group by clause and subquery and directly order against numbers.
In short you can achieved all the sequences respected to alpha numeric
values after modify that below query and the upper one are the base
query for all the goals.
SELECT A.TITLE--, A.NUMBERS, A.TEXTVAL
FROM (
SELECT A.TITLE,
STUFF((
SELECT ' ' + B.WORDS
FROM #FINAL B
WHERE B.TITLE = A.TITLE
FOR XML PATH(''),TYPE).VALUE('(./TEXT())[1]','VARCHAR(MAX)')
,1,1,'') TEXTVAL,
SUM(ISNULL(A.NUMBERS,0)) NUMBERS
FROM #FINAL A
GROUP BY A.TITLE
) A
ORDER BY A.TEXTVAL, A.NUMBERS
DROP TABLE #FINAL
DROP TABLE #TVSEASON
In last I drops both temp table from memory. I think it is the query for sorting values which you want because if anyone have different order requirement agains alphanumeric values they can achieved their requirement after litle bit modify that query.
This question requirement is complex. So it can't be achieved by a simple query. So my solution is below:
First I create a sample data which will be use in this query.
CREATE TABLE #TVSEASON (TITLE VARCHAR(100));
INSERT INTO #TVSEASON (TITLE) VALUES
('100'),
('100 SEASON 03'),
('100 SEASON 1'),
('100 SEASON 10'),
('100 SEASON 2'),
('100 SEASON 4'),
('SHOW (2008) SEASON 1'),
('SHOW (2008) SEASON 2'),
('SHOW SEASON 1 (2008)'),
('SHOW SEASON 2 (2008)'),
('SHOW SEASON 10 (2008)'),
('ANOTHER 1ST ANNIVERSARY SEASON 01'),
('ANOTHER 2ND ANNIVERSARY SEASON 01'),
('ANOTHER 10TH ANNIVERSARY SEASON 01'),
('ANOTHER SEASON 01'),
('ANOTHER SEASON 02'),
('SOME SHOW ANOTHER NO SEASON NUMBER'),
('SOME SHOW NO SEASON NUMBER'),
('SHOW 2 THE 75TH ANNIVERSARY SEASON 1'),
('SOME SHOW WITH SEASON NUMBER 1'),
('SOME SHOW WITH SEASON NUMBER 2'),
('SOME SHOW WITH SEASON NUMBER 10')
For the achieved desired result I create a function for split all words and numbers from the text.
(Note: I also remove st from 1st, nd from 2nd etc through function after trim the spaces between 1 st for safe side if any user mistakely type spaces between 1st, so if you think there is no chance of error then you remove LTRIM from that function, because for removing that values it is also remove th if text has value like "1 the title" which will be convert into 1 e title)
--CREATE SPLIT FUNCTION
CREATE FUNCTION [dbo].[SplitAlphaNumeric]
(
@LIST NVARCHAR(2000)
)
RETURNS @RTNVALUE TABLE
(
ID INT IDENTITY(1,1),
WORDS NVARCHAR(100),
NUMBERS INT
)
AS
BEGIN
WHILE (PATINDEX('%[0-9]%',@LIST) > 0)
BEGIN
INSERT INTO @RTNVALUE (WORDS, NUMBERS)
SELECT CASE WHEN PATINDEX('%[0-9]%',@LIST) = 0 THEN @LIST
WHEN (PATINDEX('%[0-9]%',@LIST) = 1 AND PATINDEX('%[^0-9]%',@LIST) = 0) THEN ''
WHEN PATINDEX('%[0-9]%',@LIST) = 1 THEN ''
ELSE SUBSTRING(@LIST, 1, PATINDEX('%[0-9]%',@LIST) - 1)
END,
CASE WHEN PATINDEX('%[0-9]%',@LIST) = 0 THEN NULL
WHEN (PATINDEX('%[0-9]%',@LIST) = 1 AND PATINDEX('%[^0-9]%',@LIST) = 0) THEN CAST(LTRIM(RTRIM(@LIST)) AS INT)
WHEN PATINDEX('%[0-9]%',@LIST) = 1 THEN SUBSTRING(@LIST, 1, PATINDEX('%[^0-9]%',@LIST) - 1)
ELSE NULL
END
SET @LIST = LTRIM(RTRIM(CASE WHEN PATINDEX('%[0-9]%',@LIST) = 0 OR (PATINDEX('%[0-9]%',@LIST) = 1 AND PATINDEX('%[^0-9]%',@LIST) = 0) THEN ''
WHEN PATINDEX('%[0-9]%',@LIST) = 1 THEN
CASE WHEN LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))) LIKE 'ST%' THEN SUBSTRING(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))),3, LEN(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST))))))
WHEN LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))) LIKE 'ND%' THEN SUBSTRING(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))),3, LEN(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST))))))
WHEN LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))) LIKE 'RD%' THEN SUBSTRING(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))),3, LEN(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST))))))
WHEN LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))) LIKE 'TH%' THEN SUBSTRING(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST)))),3, LEN(LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST))))))
ELSE LTRIM(SUBSTRING(@LIST, PATINDEX('%[^0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[^0-9]%',REVERSE(@LIST))))
END
ELSE SUBSTRING(@LIST, PATINDEX('%[0-9]%',@LIST), LEN(@LIST)-PATINDEX('%[0-9]%',REVERSE(@LIST)))
END))
END
INSERT INTO @RTNVALUE (WORDS)
SELECT VALUE = LTRIM(RTRIM(@LIST))
RETURN
END
In third step I use cross apply on calling function because function return table against given string value. On select query I insert all columns into temp table for sort values as per requirement in next step.
SELECT T.TITLE, A.ID, A.NUMBERS, A.WORDS INTO #FINAL
FROM #TVSEASON T
CROSS APPLY dbo.SplitAlphaNumeric(TITLE) A
From the temp table #Final I use stuff for concate all words to make title again without any number occurence in the text, and then use that values to order the title.
You can change that query for order in any sequence like if you want
to order against the text then you order first textval column then
numbers, but if you want to order against summation of all the numbers
which are used in title then order numbers first after sum like I do
or else if you want to order on simple number without sum then don't
use group by clause and subquery and directly order against numbers.
In short you can achieved all the sequences respected to alpha numeric
values after modify that below query and the upper one are the base
query for all the goals.
SELECT A.TITLE--, A.NUMBERS, A.TEXTVAL
FROM (
SELECT A.TITLE,
STUFF((
SELECT ' ' + B.WORDS
FROM #FINAL B
WHERE B.TITLE = A.TITLE
FOR XML PATH(''),TYPE).VALUE('(./TEXT())[1]','VARCHAR(MAX)')
,1,1,'') TEXTVAL,
SUM(ISNULL(A.NUMBERS,0)) NUMBERS
FROM #FINAL A
GROUP BY A.TITLE
) A
ORDER BY A.TEXTVAL, A.NUMBERS
DROP TABLE #FINAL
DROP TABLE #TVSEASON
In last I drops both temp table from memory. I think it is the query for sorting values which you want because if anyone have different order requirement agains alphanumeric values they can achieved their requirement after litle bit modify that query.
edited Nov 16 at 15:34
answered Nov 16 at 13:20
Adnan Ahmed Ansari
715
715
add a comment |
add a comment |
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f49374610%2fnatural-human-alpha-numeric-sort-in-microsoft-sql-server%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown