how to practice large SQL Server index or partition issues for interview?
.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty{ height:90px;width:728px;box-sizing:border-box;
}
Sorry if this should be on careers or workplace stack exchanges. It seemed to cross boundaries so I wasn't sure which to put it on.
I have an onsite, technical interview coming up, at a highly desirable company, but it's way outside my current skill set. I'm a senior level Java guy. This opportunity is not only crossing over to the .NET stack, but on a team where they're doing multi-threaded queries on a massive SQL Server database (billions of rows, perhaps?)
At one point the manager said, "I don't want to waste your time, but I can bring you in if you want..." I'm surprised he didn't simply say, sorry, you don't have the skills we need. I am completely out of my element. I've done various amounts and depths of SQL in my career, but not to this scale, and not really recently (done more noSQL recently like Mongo and Amazon Dynamo). But I'm going through with it because I'm looking for any in at this place (and hate current job, and have time to spare to prepare).
I've already done basic searches like SQL Server technical interview questions and have begun going through them, but that can't replace actual experience. Is there any way to set up my own large SQL Server with some big public data to practice some of this?
.net sql-server multithreading bigdata
add a comment |
Sorry if this should be on careers or workplace stack exchanges. It seemed to cross boundaries so I wasn't sure which to put it on.
I have an onsite, technical interview coming up, at a highly desirable company, but it's way outside my current skill set. I'm a senior level Java guy. This opportunity is not only crossing over to the .NET stack, but on a team where they're doing multi-threaded queries on a massive SQL Server database (billions of rows, perhaps?)
At one point the manager said, "I don't want to waste your time, but I can bring you in if you want..." I'm surprised he didn't simply say, sorry, you don't have the skills we need. I am completely out of my element. I've done various amounts and depths of SQL in my career, but not to this scale, and not really recently (done more noSQL recently like Mongo and Amazon Dynamo). But I'm going through with it because I'm looking for any in at this place (and hate current job, and have time to spare to prepare).
I've already done basic searches like SQL Server technical interview questions and have begun going through them, but that can't replace actual experience. Is there any way to set up my own large SQL Server with some big public data to practice some of this?
.net sql-server multithreading bigdata
2
Of course. You can use adventureworks, the sample database from MS. There are some other databases you can use too. But what is your definition of big? The scale of data these days makes "lots of data" incredibly ambiguous. To be honest, any decent interviewer will see through your preparation and expose the lack of knowledge depth pretty easily. And if not in the interview it will happen on the job pretty fast.
– Sean Lange
Nov 16 '18 at 20:52
1
Stack overflow also has a data download at archive.org/details/stackexchange and brent ozar has a nice intro on it brentozar.com/archive/category/tools/stack-overflow-database
– klabranche
Nov 16 '18 at 21:13
add a comment |
Sorry if this should be on careers or workplace stack exchanges. It seemed to cross boundaries so I wasn't sure which to put it on.
I have an onsite, technical interview coming up, at a highly desirable company, but it's way outside my current skill set. I'm a senior level Java guy. This opportunity is not only crossing over to the .NET stack, but on a team where they're doing multi-threaded queries on a massive SQL Server database (billions of rows, perhaps?)
At one point the manager said, "I don't want to waste your time, but I can bring you in if you want..." I'm surprised he didn't simply say, sorry, you don't have the skills we need. I am completely out of my element. I've done various amounts and depths of SQL in my career, but not to this scale, and not really recently (done more noSQL recently like Mongo and Amazon Dynamo). But I'm going through with it because I'm looking for any in at this place (and hate current job, and have time to spare to prepare).
I've already done basic searches like SQL Server technical interview questions and have begun going through them, but that can't replace actual experience. Is there any way to set up my own large SQL Server with some big public data to practice some of this?
.net sql-server multithreading bigdata
Sorry if this should be on careers or workplace stack exchanges. It seemed to cross boundaries so I wasn't sure which to put it on.
I have an onsite, technical interview coming up, at a highly desirable company, but it's way outside my current skill set. I'm a senior level Java guy. This opportunity is not only crossing over to the .NET stack, but on a team where they're doing multi-threaded queries on a massive SQL Server database (billions of rows, perhaps?)
At one point the manager said, "I don't want to waste your time, but I can bring you in if you want..." I'm surprised he didn't simply say, sorry, you don't have the skills we need. I am completely out of my element. I've done various amounts and depths of SQL in my career, but not to this scale, and not really recently (done more noSQL recently like Mongo and Amazon Dynamo). But I'm going through with it because I'm looking for any in at this place (and hate current job, and have time to spare to prepare).
I've already done basic searches like SQL Server technical interview questions and have begun going through them, but that can't replace actual experience. Is there any way to set up my own large SQL Server with some big public data to practice some of this?
.net sql-server multithreading bigdata
.net sql-server multithreading bigdata
asked Nov 16 '18 at 20:39
user26270user26270
2,808124070
2,808124070
2
Of course. You can use adventureworks, the sample database from MS. There are some other databases you can use too. But what is your definition of big? The scale of data these days makes "lots of data" incredibly ambiguous. To be honest, any decent interviewer will see through your preparation and expose the lack of knowledge depth pretty easily. And if not in the interview it will happen on the job pretty fast.
– Sean Lange
Nov 16 '18 at 20:52
1
Stack overflow also has a data download at archive.org/details/stackexchange and brent ozar has a nice intro on it brentozar.com/archive/category/tools/stack-overflow-database
– klabranche
Nov 16 '18 at 21:13
add a comment |
2
Of course. You can use adventureworks, the sample database from MS. There are some other databases you can use too. But what is your definition of big? The scale of data these days makes "lots of data" incredibly ambiguous. To be honest, any decent interviewer will see through your preparation and expose the lack of knowledge depth pretty easily. And if not in the interview it will happen on the job pretty fast.
– Sean Lange
Nov 16 '18 at 20:52
1
Stack overflow also has a data download at archive.org/details/stackexchange and brent ozar has a nice intro on it brentozar.com/archive/category/tools/stack-overflow-database
– klabranche
Nov 16 '18 at 21:13
2
2
Of course. You can use adventureworks, the sample database from MS. There are some other databases you can use too. But what is your definition of big? The scale of data these days makes "lots of data" incredibly ambiguous. To be honest, any decent interviewer will see through your preparation and expose the lack of knowledge depth pretty easily. And if not in the interview it will happen on the job pretty fast.
– Sean Lange
Nov 16 '18 at 20:52
Of course. You can use adventureworks, the sample database from MS. There are some other databases you can use too. But what is your definition of big? The scale of data these days makes "lots of data" incredibly ambiguous. To be honest, any decent interviewer will see through your preparation and expose the lack of knowledge depth pretty easily. And if not in the interview it will happen on the job pretty fast.
– Sean Lange
Nov 16 '18 at 20:52
1
1
Stack overflow also has a data download at archive.org/details/stackexchange and brent ozar has a nice intro on it brentozar.com/archive/category/tools/stack-overflow-database
– klabranche
Nov 16 '18 at 21:13
Stack overflow also has a data download at archive.org/details/stackexchange and brent ozar has a nice intro on it brentozar.com/archive/category/tools/stack-overflow-database
– klabranche
Nov 16 '18 at 21:13
add a comment |
2 Answers
2
active
oldest
votes
I just would like to show where I would start.
Create table without VARCHAR(MAX) to avoid performance issues
CREATE TABLE Person
(
PersonID INT NOT NULL,
FirstName VARCHAR(50),
LastName VARCHAR(50),
Address VARCHAR(50)
)
Insert some test data. The table should not have any indexes before loading data as indexes decrease performance of inserting
DECLARE @RunDate datetime
SET @RunDate=GETDATE()
DECLARE @values TABLE (DataValue int, RandValue INT)
;WITH mycte AS
(
SELECT 1 DataValue
UNION all
SELECT DataValue + 1
FROM mycte
WHERE DataValue + 1 <= 1000000
)
INSERT INTO Person(PersonID, FirstName, LastName, Address)
SELECT
DataValue
, 'FirstName' + CAST(DataValue AS VARCHAR(10))
, 'LastName' + CAST(DataValue AS VARCHAR(10))
, 'Address' + CAST(DataValue AS VARCHAR(10))
FROM mycte m
OPTION (MAXRECURSION 0)
Then create indexes.
-- Clustered index
ALTER TABLE dbo.Person ADD CONSTRAINT PK_PersonID PRIMARY KEY CLUSTERED
(PersonID);
CREATE NONCLUSTERED INDEX [IX_Person_FirstName_LastName]
ON dbo.Person
(FirstName, LastName) INCLUDE(Address)
Practice query tuning. Click
Ctrl + M
in SSMS to see query plan. Compare query plans. Try to tune queryQuery 2
:
-- Query 1
SELECT * FROM dbo.Person p
WHERE p.FirstName = 'FirstName1'
-- Query 2
SELECT * FROM dbo.Person p
WHERE p.Address = 'Address1'
When you make I/O operations, then you should use async operations. So you need to
use some ORM. For example, Entity Framework
In my experince, I have not founded a reason to use multithreading in ASP.NET MVC
when you work with IO, as all logic is executing in database. So optimizer of
SQL Server makes parallelization by itself. So there is no need to use
multithreading when you work with IO operations.
public async Task<List<Person>> GetAllPersonsWithName(string personName)
{
return await db.Persons.Where(u => u.FirstName == personName).ToListAsync();
}
Then try to run stored procedures asynchronously (your team will be glad if you use asynchronoous operations)
Task<int> firstResult =
db.Database.ExecuteSqlCommandAsync("exec FooProcedure @param",
new SqlParameter("@param", yourParam));
Task<int> secondResult =
db.Database.ExecuteSqlCommandAsync("exec FooProcedure1 @param1",
new SqlParameter("@param1", yourParam1));
await Task.WhenAll(mcResult, dcaiResult);
int fr = firstResult.Result;
int sr = secondResult.Result;
Console.WriteLine($"Results are fr {fr}, sr {sr} ");
And some more advices when you will write C#
and T-SQL
code:
The C# code must be
async
. Asasync/await
saves threads. High performance and scalability must not work with one-thread-per-socket. Good article to read. This code is really cool!
In addition, if your future team uses EntityFramework, then use method
.AsNoTracking()
when you selecting data, not adding or updating:
context.YourTable.AsNoTracking()
try to avoid writing complex
LINQ to Entities
queries. AsEntityFramework
can create your variables with type likeVARCHAR(MAX)
. And this decreases performance. Instead just create User-Defined Function. For complicated logic usestored procedures
,views
,user-defined functions
.Try to tune queries. Read this post about query tuning. In addition,make test examples and try to achieve index seek operation, find the way to solve key lookups( avoid any
implicitexplicit conversions
inWHERE
statement, add columns to index orinclude
this column). It is possible to make tuning queries inAdventureWorks
database.
add a comment |
(I agree that this is a borderline question between careers/workplace and SO, but OTOH the question is more on a technical/software developer level so let's attempt to answer it.)
First, congratulations to the opportunity! I recently switched jobs from one technology stack (Ruby and some C#) to a rather different stack (large Java application with 10+ years on its neck), so I can definitely understand if this feels both challenging and a bit frightening at the same time.
So let's focus on some aspects here:
You have done SQL queries before, I assume you have a general understanding of relational algebra,
INNER
&OUTER
joins, the importance of proper indexes to avoid unnecessary table scans - this kind of basic stuff. (if not, it's definitely areas where you could train a bit.)
You talk about doing "multi-threaded queries on a massive SQL server database" - potentially billions of rows. I cannot say if this is likely or not; most systems do not have such big databases, but if you have had indications about it being that big, I see no reason to challenge that.
But multi-threaded is a key here. As anyone who has done multi-threaded programming knows, the challenges you face there are different to the problems you face in traditional, sequential programming. I am thinking about locking, trying to avoid deadlocks, trying to avoid locking data in the first place, and things like that. It's worth remembering that Microsoft SQL Server has a different locking strategy than some other databases (depending on what you've used before), so learning when to use (and when not to use)
READ UNCOMMITTED
can be critical sometimes. More information: Why use a READ UNCOMMITTED isolation level?
You also write in your question:
Is there any way to set up my own large SQL Server with some big public data to practice some of this?
I'm not sure I would recommend this. I mean, if you don't know a subject, you can't really fake it, and trying to learn more complex topics just for an interview seldom works really well. (sorry to sound a bit negative here)
It also depends greatly on the company where you're being interviewed. I've experienced having to go through coding test, phone interview plus challenging five-hour on-site interview and still don't get the job... while at another company, I just had lunch with the manager (whom I knew a bit from beforehand) and then a phone interview with a couple of key people + emailing a bit back and forth until I got an offer. So the requirements and the process varies greatly between companies.
I believe in honesty, even when it comes to job seeking. Be yourself, and don't be afraid of the areas where you feel you are lacking. If you have many years of Java experience, I'm sure you can learn the .NET/C# mindset quite easily, especially if you have worked with multi-threaded programming, perhaps dependency injection/inversion control frameworks (Spring etc).
1
just wanted to say thanks for the lengthy and thorough response
– user26270
Nov 19 '18 at 14:00
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53345076%2fhow-to-practice-large-sql-server-index-or-partition-issues-for-interview%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
I just would like to show where I would start.
Create table without VARCHAR(MAX) to avoid performance issues
CREATE TABLE Person
(
PersonID INT NOT NULL,
FirstName VARCHAR(50),
LastName VARCHAR(50),
Address VARCHAR(50)
)
Insert some test data. The table should not have any indexes before loading data as indexes decrease performance of inserting
DECLARE @RunDate datetime
SET @RunDate=GETDATE()
DECLARE @values TABLE (DataValue int, RandValue INT)
;WITH mycte AS
(
SELECT 1 DataValue
UNION all
SELECT DataValue + 1
FROM mycte
WHERE DataValue + 1 <= 1000000
)
INSERT INTO Person(PersonID, FirstName, LastName, Address)
SELECT
DataValue
, 'FirstName' + CAST(DataValue AS VARCHAR(10))
, 'LastName' + CAST(DataValue AS VARCHAR(10))
, 'Address' + CAST(DataValue AS VARCHAR(10))
FROM mycte m
OPTION (MAXRECURSION 0)
Then create indexes.
-- Clustered index
ALTER TABLE dbo.Person ADD CONSTRAINT PK_PersonID PRIMARY KEY CLUSTERED
(PersonID);
CREATE NONCLUSTERED INDEX [IX_Person_FirstName_LastName]
ON dbo.Person
(FirstName, LastName) INCLUDE(Address)
Practice query tuning. Click
Ctrl + M
in SSMS to see query plan. Compare query plans. Try to tune queryQuery 2
:
-- Query 1
SELECT * FROM dbo.Person p
WHERE p.FirstName = 'FirstName1'
-- Query 2
SELECT * FROM dbo.Person p
WHERE p.Address = 'Address1'
When you make I/O operations, then you should use async operations. So you need to
use some ORM. For example, Entity Framework
In my experince, I have not founded a reason to use multithreading in ASP.NET MVC
when you work with IO, as all logic is executing in database. So optimizer of
SQL Server makes parallelization by itself. So there is no need to use
multithreading when you work with IO operations.
public async Task<List<Person>> GetAllPersonsWithName(string personName)
{
return await db.Persons.Where(u => u.FirstName == personName).ToListAsync();
}
Then try to run stored procedures asynchronously (your team will be glad if you use asynchronoous operations)
Task<int> firstResult =
db.Database.ExecuteSqlCommandAsync("exec FooProcedure @param",
new SqlParameter("@param", yourParam));
Task<int> secondResult =
db.Database.ExecuteSqlCommandAsync("exec FooProcedure1 @param1",
new SqlParameter("@param1", yourParam1));
await Task.WhenAll(mcResult, dcaiResult);
int fr = firstResult.Result;
int sr = secondResult.Result;
Console.WriteLine($"Results are fr {fr}, sr {sr} ");
And some more advices when you will write C#
and T-SQL
code:
The C# code must be
async
. Asasync/await
saves threads. High performance and scalability must not work with one-thread-per-socket. Good article to read. This code is really cool!
In addition, if your future team uses EntityFramework, then use method
.AsNoTracking()
when you selecting data, not adding or updating:
context.YourTable.AsNoTracking()
try to avoid writing complex
LINQ to Entities
queries. AsEntityFramework
can create your variables with type likeVARCHAR(MAX)
. And this decreases performance. Instead just create User-Defined Function. For complicated logic usestored procedures
,views
,user-defined functions
.Try to tune queries. Read this post about query tuning. In addition,make test examples and try to achieve index seek operation, find the way to solve key lookups( avoid any
implicitexplicit conversions
inWHERE
statement, add columns to index orinclude
this column). It is possible to make tuning queries inAdventureWorks
database.
add a comment |
I just would like to show where I would start.
Create table without VARCHAR(MAX) to avoid performance issues
CREATE TABLE Person
(
PersonID INT NOT NULL,
FirstName VARCHAR(50),
LastName VARCHAR(50),
Address VARCHAR(50)
)
Insert some test data. The table should not have any indexes before loading data as indexes decrease performance of inserting
DECLARE @RunDate datetime
SET @RunDate=GETDATE()
DECLARE @values TABLE (DataValue int, RandValue INT)
;WITH mycte AS
(
SELECT 1 DataValue
UNION all
SELECT DataValue + 1
FROM mycte
WHERE DataValue + 1 <= 1000000
)
INSERT INTO Person(PersonID, FirstName, LastName, Address)
SELECT
DataValue
, 'FirstName' + CAST(DataValue AS VARCHAR(10))
, 'LastName' + CAST(DataValue AS VARCHAR(10))
, 'Address' + CAST(DataValue AS VARCHAR(10))
FROM mycte m
OPTION (MAXRECURSION 0)
Then create indexes.
-- Clustered index
ALTER TABLE dbo.Person ADD CONSTRAINT PK_PersonID PRIMARY KEY CLUSTERED
(PersonID);
CREATE NONCLUSTERED INDEX [IX_Person_FirstName_LastName]
ON dbo.Person
(FirstName, LastName) INCLUDE(Address)
Practice query tuning. Click
Ctrl + M
in SSMS to see query plan. Compare query plans. Try to tune queryQuery 2
:
-- Query 1
SELECT * FROM dbo.Person p
WHERE p.FirstName = 'FirstName1'
-- Query 2
SELECT * FROM dbo.Person p
WHERE p.Address = 'Address1'
When you make I/O operations, then you should use async operations. So you need to
use some ORM. For example, Entity Framework
In my experince, I have not founded a reason to use multithreading in ASP.NET MVC
when you work with IO, as all logic is executing in database. So optimizer of
SQL Server makes parallelization by itself. So there is no need to use
multithreading when you work with IO operations.
public async Task<List<Person>> GetAllPersonsWithName(string personName)
{
return await db.Persons.Where(u => u.FirstName == personName).ToListAsync();
}
Then try to run stored procedures asynchronously (your team will be glad if you use asynchronoous operations)
Task<int> firstResult =
db.Database.ExecuteSqlCommandAsync("exec FooProcedure @param",
new SqlParameter("@param", yourParam));
Task<int> secondResult =
db.Database.ExecuteSqlCommandAsync("exec FooProcedure1 @param1",
new SqlParameter("@param1", yourParam1));
await Task.WhenAll(mcResult, dcaiResult);
int fr = firstResult.Result;
int sr = secondResult.Result;
Console.WriteLine($"Results are fr {fr}, sr {sr} ");
And some more advices when you will write C#
and T-SQL
code:
The C# code must be
async
. Asasync/await
saves threads. High performance and scalability must not work with one-thread-per-socket. Good article to read. This code is really cool!
In addition, if your future team uses EntityFramework, then use method
.AsNoTracking()
when you selecting data, not adding or updating:
context.YourTable.AsNoTracking()
try to avoid writing complex
LINQ to Entities
queries. AsEntityFramework
can create your variables with type likeVARCHAR(MAX)
. And this decreases performance. Instead just create User-Defined Function. For complicated logic usestored procedures
,views
,user-defined functions
.Try to tune queries. Read this post about query tuning. In addition,make test examples and try to achieve index seek operation, find the way to solve key lookups( avoid any
implicitexplicit conversions
inWHERE
statement, add columns to index orinclude
this column). It is possible to make tuning queries inAdventureWorks
database.
add a comment |
I just would like to show where I would start.
Create table without VARCHAR(MAX) to avoid performance issues
CREATE TABLE Person
(
PersonID INT NOT NULL,
FirstName VARCHAR(50),
LastName VARCHAR(50),
Address VARCHAR(50)
)
Insert some test data. The table should not have any indexes before loading data as indexes decrease performance of inserting
DECLARE @RunDate datetime
SET @RunDate=GETDATE()
DECLARE @values TABLE (DataValue int, RandValue INT)
;WITH mycte AS
(
SELECT 1 DataValue
UNION all
SELECT DataValue + 1
FROM mycte
WHERE DataValue + 1 <= 1000000
)
INSERT INTO Person(PersonID, FirstName, LastName, Address)
SELECT
DataValue
, 'FirstName' + CAST(DataValue AS VARCHAR(10))
, 'LastName' + CAST(DataValue AS VARCHAR(10))
, 'Address' + CAST(DataValue AS VARCHAR(10))
FROM mycte m
OPTION (MAXRECURSION 0)
Then create indexes.
-- Clustered index
ALTER TABLE dbo.Person ADD CONSTRAINT PK_PersonID PRIMARY KEY CLUSTERED
(PersonID);
CREATE NONCLUSTERED INDEX [IX_Person_FirstName_LastName]
ON dbo.Person
(FirstName, LastName) INCLUDE(Address)
Practice query tuning. Click
Ctrl + M
in SSMS to see query plan. Compare query plans. Try to tune queryQuery 2
:
-- Query 1
SELECT * FROM dbo.Person p
WHERE p.FirstName = 'FirstName1'
-- Query 2
SELECT * FROM dbo.Person p
WHERE p.Address = 'Address1'
When you make I/O operations, then you should use async operations. So you need to
use some ORM. For example, Entity Framework
In my experince, I have not founded a reason to use multithreading in ASP.NET MVC
when you work with IO, as all logic is executing in database. So optimizer of
SQL Server makes parallelization by itself. So there is no need to use
multithreading when you work with IO operations.
public async Task<List<Person>> GetAllPersonsWithName(string personName)
{
return await db.Persons.Where(u => u.FirstName == personName).ToListAsync();
}
Then try to run stored procedures asynchronously (your team will be glad if you use asynchronoous operations)
Task<int> firstResult =
db.Database.ExecuteSqlCommandAsync("exec FooProcedure @param",
new SqlParameter("@param", yourParam));
Task<int> secondResult =
db.Database.ExecuteSqlCommandAsync("exec FooProcedure1 @param1",
new SqlParameter("@param1", yourParam1));
await Task.WhenAll(mcResult, dcaiResult);
int fr = firstResult.Result;
int sr = secondResult.Result;
Console.WriteLine($"Results are fr {fr}, sr {sr} ");
And some more advices when you will write C#
and T-SQL
code:
The C# code must be
async
. Asasync/await
saves threads. High performance and scalability must not work with one-thread-per-socket. Good article to read. This code is really cool!
In addition, if your future team uses EntityFramework, then use method
.AsNoTracking()
when you selecting data, not adding or updating:
context.YourTable.AsNoTracking()
try to avoid writing complex
LINQ to Entities
queries. AsEntityFramework
can create your variables with type likeVARCHAR(MAX)
. And this decreases performance. Instead just create User-Defined Function. For complicated logic usestored procedures
,views
,user-defined functions
.Try to tune queries. Read this post about query tuning. In addition,make test examples and try to achieve index seek operation, find the way to solve key lookups( avoid any
implicitexplicit conversions
inWHERE
statement, add columns to index orinclude
this column). It is possible to make tuning queries inAdventureWorks
database.
I just would like to show where I would start.
Create table without VARCHAR(MAX) to avoid performance issues
CREATE TABLE Person
(
PersonID INT NOT NULL,
FirstName VARCHAR(50),
LastName VARCHAR(50),
Address VARCHAR(50)
)
Insert some test data. The table should not have any indexes before loading data as indexes decrease performance of inserting
DECLARE @RunDate datetime
SET @RunDate=GETDATE()
DECLARE @values TABLE (DataValue int, RandValue INT)
;WITH mycte AS
(
SELECT 1 DataValue
UNION all
SELECT DataValue + 1
FROM mycte
WHERE DataValue + 1 <= 1000000
)
INSERT INTO Person(PersonID, FirstName, LastName, Address)
SELECT
DataValue
, 'FirstName' + CAST(DataValue AS VARCHAR(10))
, 'LastName' + CAST(DataValue AS VARCHAR(10))
, 'Address' + CAST(DataValue AS VARCHAR(10))
FROM mycte m
OPTION (MAXRECURSION 0)
Then create indexes.
-- Clustered index
ALTER TABLE dbo.Person ADD CONSTRAINT PK_PersonID PRIMARY KEY CLUSTERED
(PersonID);
CREATE NONCLUSTERED INDEX [IX_Person_FirstName_LastName]
ON dbo.Person
(FirstName, LastName) INCLUDE(Address)
Practice query tuning. Click
Ctrl + M
in SSMS to see query plan. Compare query plans. Try to tune queryQuery 2
:
-- Query 1
SELECT * FROM dbo.Person p
WHERE p.FirstName = 'FirstName1'
-- Query 2
SELECT * FROM dbo.Person p
WHERE p.Address = 'Address1'
When you make I/O operations, then you should use async operations. So you need to
use some ORM. For example, Entity Framework
In my experince, I have not founded a reason to use multithreading in ASP.NET MVC
when you work with IO, as all logic is executing in database. So optimizer of
SQL Server makes parallelization by itself. So there is no need to use
multithreading when you work with IO operations.
public async Task<List<Person>> GetAllPersonsWithName(string personName)
{
return await db.Persons.Where(u => u.FirstName == personName).ToListAsync();
}
Then try to run stored procedures asynchronously (your team will be glad if you use asynchronoous operations)
Task<int> firstResult =
db.Database.ExecuteSqlCommandAsync("exec FooProcedure @param",
new SqlParameter("@param", yourParam));
Task<int> secondResult =
db.Database.ExecuteSqlCommandAsync("exec FooProcedure1 @param1",
new SqlParameter("@param1", yourParam1));
await Task.WhenAll(mcResult, dcaiResult);
int fr = firstResult.Result;
int sr = secondResult.Result;
Console.WriteLine($"Results are fr {fr}, sr {sr} ");
And some more advices when you will write C#
and T-SQL
code:
The C# code must be
async
. Asasync/await
saves threads. High performance and scalability must not work with one-thread-per-socket. Good article to read. This code is really cool!
In addition, if your future team uses EntityFramework, then use method
.AsNoTracking()
when you selecting data, not adding or updating:
context.YourTable.AsNoTracking()
try to avoid writing complex
LINQ to Entities
queries. AsEntityFramework
can create your variables with type likeVARCHAR(MAX)
. And this decreases performance. Instead just create User-Defined Function. For complicated logic usestored procedures
,views
,user-defined functions
.Try to tune queries. Read this post about query tuning. In addition,make test examples and try to achieve index seek operation, find the way to solve key lookups( avoid any
implicitexplicit conversions
inWHERE
statement, add columns to index orinclude
this column). It is possible to make tuning queries inAdventureWorks
database.
edited Nov 21 '18 at 11:42
answered Nov 19 '18 at 13:00
StepUpStepUp
8,81284576
8,81284576
add a comment |
add a comment |
(I agree that this is a borderline question between careers/workplace and SO, but OTOH the question is more on a technical/software developer level so let's attempt to answer it.)
First, congratulations to the opportunity! I recently switched jobs from one technology stack (Ruby and some C#) to a rather different stack (large Java application with 10+ years on its neck), so I can definitely understand if this feels both challenging and a bit frightening at the same time.
So let's focus on some aspects here:
You have done SQL queries before, I assume you have a general understanding of relational algebra,
INNER
&OUTER
joins, the importance of proper indexes to avoid unnecessary table scans - this kind of basic stuff. (if not, it's definitely areas where you could train a bit.)
You talk about doing "multi-threaded queries on a massive SQL server database" - potentially billions of rows. I cannot say if this is likely or not; most systems do not have such big databases, but if you have had indications about it being that big, I see no reason to challenge that.
But multi-threaded is a key here. As anyone who has done multi-threaded programming knows, the challenges you face there are different to the problems you face in traditional, sequential programming. I am thinking about locking, trying to avoid deadlocks, trying to avoid locking data in the first place, and things like that. It's worth remembering that Microsoft SQL Server has a different locking strategy than some other databases (depending on what you've used before), so learning when to use (and when not to use)
READ UNCOMMITTED
can be critical sometimes. More information: Why use a READ UNCOMMITTED isolation level?
You also write in your question:
Is there any way to set up my own large SQL Server with some big public data to practice some of this?
I'm not sure I would recommend this. I mean, if you don't know a subject, you can't really fake it, and trying to learn more complex topics just for an interview seldom works really well. (sorry to sound a bit negative here)
It also depends greatly on the company where you're being interviewed. I've experienced having to go through coding test, phone interview plus challenging five-hour on-site interview and still don't get the job... while at another company, I just had lunch with the manager (whom I knew a bit from beforehand) and then a phone interview with a couple of key people + emailing a bit back and forth until I got an offer. So the requirements and the process varies greatly between companies.
I believe in honesty, even when it comes to job seeking. Be yourself, and don't be afraid of the areas where you feel you are lacking. If you have many years of Java experience, I'm sure you can learn the .NET/C# mindset quite easily, especially if you have worked with multi-threaded programming, perhaps dependency injection/inversion control frameworks (Spring etc).
1
just wanted to say thanks for the lengthy and thorough response
– user26270
Nov 19 '18 at 14:00
add a comment |
(I agree that this is a borderline question between careers/workplace and SO, but OTOH the question is more on a technical/software developer level so let's attempt to answer it.)
First, congratulations to the opportunity! I recently switched jobs from one technology stack (Ruby and some C#) to a rather different stack (large Java application with 10+ years on its neck), so I can definitely understand if this feels both challenging and a bit frightening at the same time.
So let's focus on some aspects here:
You have done SQL queries before, I assume you have a general understanding of relational algebra,
INNER
&OUTER
joins, the importance of proper indexes to avoid unnecessary table scans - this kind of basic stuff. (if not, it's definitely areas where you could train a bit.)
You talk about doing "multi-threaded queries on a massive SQL server database" - potentially billions of rows. I cannot say if this is likely or not; most systems do not have such big databases, but if you have had indications about it being that big, I see no reason to challenge that.
But multi-threaded is a key here. As anyone who has done multi-threaded programming knows, the challenges you face there are different to the problems you face in traditional, sequential programming. I am thinking about locking, trying to avoid deadlocks, trying to avoid locking data in the first place, and things like that. It's worth remembering that Microsoft SQL Server has a different locking strategy than some other databases (depending on what you've used before), so learning when to use (and when not to use)
READ UNCOMMITTED
can be critical sometimes. More information: Why use a READ UNCOMMITTED isolation level?
You also write in your question:
Is there any way to set up my own large SQL Server with some big public data to practice some of this?
I'm not sure I would recommend this. I mean, if you don't know a subject, you can't really fake it, and trying to learn more complex topics just for an interview seldom works really well. (sorry to sound a bit negative here)
It also depends greatly on the company where you're being interviewed. I've experienced having to go through coding test, phone interview plus challenging five-hour on-site interview and still don't get the job... while at another company, I just had lunch with the manager (whom I knew a bit from beforehand) and then a phone interview with a couple of key people + emailing a bit back and forth until I got an offer. So the requirements and the process varies greatly between companies.
I believe in honesty, even when it comes to job seeking. Be yourself, and don't be afraid of the areas where you feel you are lacking. If you have many years of Java experience, I'm sure you can learn the .NET/C# mindset quite easily, especially if you have worked with multi-threaded programming, perhaps dependency injection/inversion control frameworks (Spring etc).
1
just wanted to say thanks for the lengthy and thorough response
– user26270
Nov 19 '18 at 14:00
add a comment |
(I agree that this is a borderline question between careers/workplace and SO, but OTOH the question is more on a technical/software developer level so let's attempt to answer it.)
First, congratulations to the opportunity! I recently switched jobs from one technology stack (Ruby and some C#) to a rather different stack (large Java application with 10+ years on its neck), so I can definitely understand if this feels both challenging and a bit frightening at the same time.
So let's focus on some aspects here:
You have done SQL queries before, I assume you have a general understanding of relational algebra,
INNER
&OUTER
joins, the importance of proper indexes to avoid unnecessary table scans - this kind of basic stuff. (if not, it's definitely areas where you could train a bit.)
You talk about doing "multi-threaded queries on a massive SQL server database" - potentially billions of rows. I cannot say if this is likely or not; most systems do not have such big databases, but if you have had indications about it being that big, I see no reason to challenge that.
But multi-threaded is a key here. As anyone who has done multi-threaded programming knows, the challenges you face there are different to the problems you face in traditional, sequential programming. I am thinking about locking, trying to avoid deadlocks, trying to avoid locking data in the first place, and things like that. It's worth remembering that Microsoft SQL Server has a different locking strategy than some other databases (depending on what you've used before), so learning when to use (and when not to use)
READ UNCOMMITTED
can be critical sometimes. More information: Why use a READ UNCOMMITTED isolation level?
You also write in your question:
Is there any way to set up my own large SQL Server with some big public data to practice some of this?
I'm not sure I would recommend this. I mean, if you don't know a subject, you can't really fake it, and trying to learn more complex topics just for an interview seldom works really well. (sorry to sound a bit negative here)
It also depends greatly on the company where you're being interviewed. I've experienced having to go through coding test, phone interview plus challenging five-hour on-site interview and still don't get the job... while at another company, I just had lunch with the manager (whom I knew a bit from beforehand) and then a phone interview with a couple of key people + emailing a bit back and forth until I got an offer. So the requirements and the process varies greatly between companies.
I believe in honesty, even when it comes to job seeking. Be yourself, and don't be afraid of the areas where you feel you are lacking. If you have many years of Java experience, I'm sure you can learn the .NET/C# mindset quite easily, especially if you have worked with multi-threaded programming, perhaps dependency injection/inversion control frameworks (Spring etc).
(I agree that this is a borderline question between careers/workplace and SO, but OTOH the question is more on a technical/software developer level so let's attempt to answer it.)
First, congratulations to the opportunity! I recently switched jobs from one technology stack (Ruby and some C#) to a rather different stack (large Java application with 10+ years on its neck), so I can definitely understand if this feels both challenging and a bit frightening at the same time.
So let's focus on some aspects here:
You have done SQL queries before, I assume you have a general understanding of relational algebra,
INNER
&OUTER
joins, the importance of proper indexes to avoid unnecessary table scans - this kind of basic stuff. (if not, it's definitely areas where you could train a bit.)
You talk about doing "multi-threaded queries on a massive SQL server database" - potentially billions of rows. I cannot say if this is likely or not; most systems do not have such big databases, but if you have had indications about it being that big, I see no reason to challenge that.
But multi-threaded is a key here. As anyone who has done multi-threaded programming knows, the challenges you face there are different to the problems you face in traditional, sequential programming. I am thinking about locking, trying to avoid deadlocks, trying to avoid locking data in the first place, and things like that. It's worth remembering that Microsoft SQL Server has a different locking strategy than some other databases (depending on what you've used before), so learning when to use (and when not to use)
READ UNCOMMITTED
can be critical sometimes. More information: Why use a READ UNCOMMITTED isolation level?
You also write in your question:
Is there any way to set up my own large SQL Server with some big public data to practice some of this?
I'm not sure I would recommend this. I mean, if you don't know a subject, you can't really fake it, and trying to learn more complex topics just for an interview seldom works really well. (sorry to sound a bit negative here)
It also depends greatly on the company where you're being interviewed. I've experienced having to go through coding test, phone interview plus challenging five-hour on-site interview and still don't get the job... while at another company, I just had lunch with the manager (whom I knew a bit from beforehand) and then a phone interview with a couple of key people + emailing a bit back and forth until I got an offer. So the requirements and the process varies greatly between companies.
I believe in honesty, even when it comes to job seeking. Be yourself, and don't be afraid of the areas where you feel you are lacking. If you have many years of Java experience, I'm sure you can learn the .NET/C# mindset quite easily, especially if you have worked with multi-threaded programming, perhaps dependency injection/inversion control frameworks (Spring etc).
answered Nov 16 '18 at 21:09
Per LundbergPer Lundberg
1,81312433
1,81312433
1
just wanted to say thanks for the lengthy and thorough response
– user26270
Nov 19 '18 at 14:00
add a comment |
1
just wanted to say thanks for the lengthy and thorough response
– user26270
Nov 19 '18 at 14:00
1
1
just wanted to say thanks for the lengthy and thorough response
– user26270
Nov 19 '18 at 14:00
just wanted to say thanks for the lengthy and thorough response
– user26270
Nov 19 '18 at 14:00
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53345076%2fhow-to-practice-large-sql-server-index-or-partition-issues-for-interview%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
2
Of course. You can use adventureworks, the sample database from MS. There are some other databases you can use too. But what is your definition of big? The scale of data these days makes "lots of data" incredibly ambiguous. To be honest, any decent interviewer will see through your preparation and expose the lack of knowledge depth pretty easily. And if not in the interview it will happen on the job pretty fast.
– Sean Lange
Nov 16 '18 at 20:52
1
Stack overflow also has a data download at archive.org/details/stackexchange and brent ozar has a nice intro on it brentozar.com/archive/category/tools/stack-overflow-database
– klabranche
Nov 16 '18 at 21:13