How to Selection Sort - Assembly 8086
.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty{ height:90px;width:728px;box-sizing:border-box;
}
I've created a procedure to selection sort a word vector but there's a problem: the sorting is totally wrong.
My vector: VET_2 DW 2, 7, 0, 1, 4, 8, 9, 3, 6, 5
; Selection Sort
SELECTION_SORT PROC
; AX = j & aux CX = i
; BX = offset/min DX = data and others
PUSH 0 ; to initialize i
MOV SI, [OFFSET VET_2]
; ----- start for(int i = 0; i < n-1; i++) -----
SLC_LOOP_FORA: ; outer loop
CALL RESET_REGIST ; reset some AX, BX, CX & DX
CALL RESET_VAR ; used to reset AUX
POP CX ; initialize i
CMP CX, 18 ; check if it's smaller than n-1 (20-2=18)
JGE SLC_FIM ; if bigger, goes to the end
MOV BX, CX ; offset receive i, the position of the smaller
; ----- start j = i+1 -----
MOV AX, CX ; AX = j.
ADD AX, 2 ; j = i+1
; ----- end j = i+1 -----
; ----- start for(int j = i+1; j < n; j++) -----
SLC_LOOP_DENTRO: ; inner loop
MOV DX, [SI+BX] ; move the smaller element to DX
MOV BX, AX ; offset receives j
CMP DX, [SI+BX] ; compare if VET_2[min]<=VET_2[j]
JL SLC_DENTRO_PULAR ; if lesser, ignore the code below
MOV BX, AX ; offset receive j, position of the smaller element
SLC_DENTRO_PULAR:
ADD AX, 2 ; inc 2 in j
CMP AX, 20 ; compare j (ax) with n
JL SLC_LOOP_DENTRO ; if it's smaller, repeat inner loop
; ----- end for(int j = n+1; j < n; j++) -----
CMP CX, BX ; compare i with the position of the smaller element
JE SLC_LOOP_FORA ; if equals, repeat outer loop, otherwise do the swap
PUSH BX ; position of the smaller element
PUSH [SI+BX] ; put vet[min] top of the stack
; ----- start aux = vet[i] -----
MOV BX, CX ; offset (BX) receives i
MOV DX, [SI+BX] ; DX receives vet_2[i]
MOV AUX, DX ; AUX receives DX
; ----- end aux = vet[i] -----
; ----- start vet[i] = vet[min] -----
POP AX ; AX receives the top of the stack (vet_2[min])
MOV [SI+BX], AX ; vet_2[i] receives DX (smaller element)
; ----- end vet[i] = vet[min] -----
; ----- start vet[min] = aux -----
POP BX ; offset (BX) receives the position of the smaller element from the stack
MOV DX, AUX ; DX receives AUX
MOV [SI+BX], DX ; vet_2[min] receives DX
; ----- end vet[min] = aux -----
ADD CX, 2 ; INC 2 on i
PUSH CX ; put in the stack
JMP SLC_LOOP_FORA repeat outer loop
; ----- end for(int i = 0; i < n-1; i++) -----
SLC_FIM: ; end the procedure
RET
SELECTION_SORT ENDP
Before call selection sort procedure: 2 7 0 1 4 8 9 3 6 5
After call selection sort procedure: 5 2 7 0 1 4 8 9 3 6
Where is the error? Can somebody help me?
assembly x86-16 selection-sort
add a comment |
I've created a procedure to selection sort a word vector but there's a problem: the sorting is totally wrong.
My vector: VET_2 DW 2, 7, 0, 1, 4, 8, 9, 3, 6, 5
; Selection Sort
SELECTION_SORT PROC
; AX = j & aux CX = i
; BX = offset/min DX = data and others
PUSH 0 ; to initialize i
MOV SI, [OFFSET VET_2]
; ----- start for(int i = 0; i < n-1; i++) -----
SLC_LOOP_FORA: ; outer loop
CALL RESET_REGIST ; reset some AX, BX, CX & DX
CALL RESET_VAR ; used to reset AUX
POP CX ; initialize i
CMP CX, 18 ; check if it's smaller than n-1 (20-2=18)
JGE SLC_FIM ; if bigger, goes to the end
MOV BX, CX ; offset receive i, the position of the smaller
; ----- start j = i+1 -----
MOV AX, CX ; AX = j.
ADD AX, 2 ; j = i+1
; ----- end j = i+1 -----
; ----- start for(int j = i+1; j < n; j++) -----
SLC_LOOP_DENTRO: ; inner loop
MOV DX, [SI+BX] ; move the smaller element to DX
MOV BX, AX ; offset receives j
CMP DX, [SI+BX] ; compare if VET_2[min]<=VET_2[j]
JL SLC_DENTRO_PULAR ; if lesser, ignore the code below
MOV BX, AX ; offset receive j, position of the smaller element
SLC_DENTRO_PULAR:
ADD AX, 2 ; inc 2 in j
CMP AX, 20 ; compare j (ax) with n
JL SLC_LOOP_DENTRO ; if it's smaller, repeat inner loop
; ----- end for(int j = n+1; j < n; j++) -----
CMP CX, BX ; compare i with the position of the smaller element
JE SLC_LOOP_FORA ; if equals, repeat outer loop, otherwise do the swap
PUSH BX ; position of the smaller element
PUSH [SI+BX] ; put vet[min] top of the stack
; ----- start aux = vet[i] -----
MOV BX, CX ; offset (BX) receives i
MOV DX, [SI+BX] ; DX receives vet_2[i]
MOV AUX, DX ; AUX receives DX
; ----- end aux = vet[i] -----
; ----- start vet[i] = vet[min] -----
POP AX ; AX receives the top of the stack (vet_2[min])
MOV [SI+BX], AX ; vet_2[i] receives DX (smaller element)
; ----- end vet[i] = vet[min] -----
; ----- start vet[min] = aux -----
POP BX ; offset (BX) receives the position of the smaller element from the stack
MOV DX, AUX ; DX receives AUX
MOV [SI+BX], DX ; vet_2[min] receives DX
; ----- end vet[min] = aux -----
ADD CX, 2 ; INC 2 on i
PUSH CX ; put in the stack
JMP SLC_LOOP_FORA repeat outer loop
; ----- end for(int i = 0; i < n-1; i++) -----
SLC_FIM: ; end the procedure
RET
SELECTION_SORT ENDP
Before call selection sort procedure: 2 7 0 1 4 8 9 3 6 5
After call selection sort procedure: 5 2 7 0 1 4 8 9 3 6
Where is the error? Can somebody help me?
assembly x86-16 selection-sort
2
POP CX
inside a loop looks suspicious to me, because youpush 0
outside the loop, but there's aJE SLC_LOOP_FORA
that can repeat that without pushing anything. So you're going to consume some stack space.xor cx,cx
ormov cx,0
would be a lot more sensible. If you run out of registers for locals, normally you store/reload them to locations like[bp-4]
after making a stack frame. (You can't use[SP+4]
or whatever in 16-bit code). push/pop inside loops is harder to reason about, leading to errors exactly like this.
– Peter Cordes
Nov 15 '18 at 16:54
1
Have you tried single-stepping through your code with a debugger to see if it leaves the loop early? And if so, you can see exactly which branch went wrong.
– Peter Cordes
Nov 15 '18 at 17:28
add a comment |
I've created a procedure to selection sort a word vector but there's a problem: the sorting is totally wrong.
My vector: VET_2 DW 2, 7, 0, 1, 4, 8, 9, 3, 6, 5
; Selection Sort
SELECTION_SORT PROC
; AX = j & aux CX = i
; BX = offset/min DX = data and others
PUSH 0 ; to initialize i
MOV SI, [OFFSET VET_2]
; ----- start for(int i = 0; i < n-1; i++) -----
SLC_LOOP_FORA: ; outer loop
CALL RESET_REGIST ; reset some AX, BX, CX & DX
CALL RESET_VAR ; used to reset AUX
POP CX ; initialize i
CMP CX, 18 ; check if it's smaller than n-1 (20-2=18)
JGE SLC_FIM ; if bigger, goes to the end
MOV BX, CX ; offset receive i, the position of the smaller
; ----- start j = i+1 -----
MOV AX, CX ; AX = j.
ADD AX, 2 ; j = i+1
; ----- end j = i+1 -----
; ----- start for(int j = i+1; j < n; j++) -----
SLC_LOOP_DENTRO: ; inner loop
MOV DX, [SI+BX] ; move the smaller element to DX
MOV BX, AX ; offset receives j
CMP DX, [SI+BX] ; compare if VET_2[min]<=VET_2[j]
JL SLC_DENTRO_PULAR ; if lesser, ignore the code below
MOV BX, AX ; offset receive j, position of the smaller element
SLC_DENTRO_PULAR:
ADD AX, 2 ; inc 2 in j
CMP AX, 20 ; compare j (ax) with n
JL SLC_LOOP_DENTRO ; if it's smaller, repeat inner loop
; ----- end for(int j = n+1; j < n; j++) -----
CMP CX, BX ; compare i with the position of the smaller element
JE SLC_LOOP_FORA ; if equals, repeat outer loop, otherwise do the swap
PUSH BX ; position of the smaller element
PUSH [SI+BX] ; put vet[min] top of the stack
; ----- start aux = vet[i] -----
MOV BX, CX ; offset (BX) receives i
MOV DX, [SI+BX] ; DX receives vet_2[i]
MOV AUX, DX ; AUX receives DX
; ----- end aux = vet[i] -----
; ----- start vet[i] = vet[min] -----
POP AX ; AX receives the top of the stack (vet_2[min])
MOV [SI+BX], AX ; vet_2[i] receives DX (smaller element)
; ----- end vet[i] = vet[min] -----
; ----- start vet[min] = aux -----
POP BX ; offset (BX) receives the position of the smaller element from the stack
MOV DX, AUX ; DX receives AUX
MOV [SI+BX], DX ; vet_2[min] receives DX
; ----- end vet[min] = aux -----
ADD CX, 2 ; INC 2 on i
PUSH CX ; put in the stack
JMP SLC_LOOP_FORA repeat outer loop
; ----- end for(int i = 0; i < n-1; i++) -----
SLC_FIM: ; end the procedure
RET
SELECTION_SORT ENDP
Before call selection sort procedure: 2 7 0 1 4 8 9 3 6 5
After call selection sort procedure: 5 2 7 0 1 4 8 9 3 6
Where is the error? Can somebody help me?
assembly x86-16 selection-sort
I've created a procedure to selection sort a word vector but there's a problem: the sorting is totally wrong.
My vector: VET_2 DW 2, 7, 0, 1, 4, 8, 9, 3, 6, 5
; Selection Sort
SELECTION_SORT PROC
; AX = j & aux CX = i
; BX = offset/min DX = data and others
PUSH 0 ; to initialize i
MOV SI, [OFFSET VET_2]
; ----- start for(int i = 0; i < n-1; i++) -----
SLC_LOOP_FORA: ; outer loop
CALL RESET_REGIST ; reset some AX, BX, CX & DX
CALL RESET_VAR ; used to reset AUX
POP CX ; initialize i
CMP CX, 18 ; check if it's smaller than n-1 (20-2=18)
JGE SLC_FIM ; if bigger, goes to the end
MOV BX, CX ; offset receive i, the position of the smaller
; ----- start j = i+1 -----
MOV AX, CX ; AX = j.
ADD AX, 2 ; j = i+1
; ----- end j = i+1 -----
; ----- start for(int j = i+1; j < n; j++) -----
SLC_LOOP_DENTRO: ; inner loop
MOV DX, [SI+BX] ; move the smaller element to DX
MOV BX, AX ; offset receives j
CMP DX, [SI+BX] ; compare if VET_2[min]<=VET_2[j]
JL SLC_DENTRO_PULAR ; if lesser, ignore the code below
MOV BX, AX ; offset receive j, position of the smaller element
SLC_DENTRO_PULAR:
ADD AX, 2 ; inc 2 in j
CMP AX, 20 ; compare j (ax) with n
JL SLC_LOOP_DENTRO ; if it's smaller, repeat inner loop
; ----- end for(int j = n+1; j < n; j++) -----
CMP CX, BX ; compare i with the position of the smaller element
JE SLC_LOOP_FORA ; if equals, repeat outer loop, otherwise do the swap
PUSH BX ; position of the smaller element
PUSH [SI+BX] ; put vet[min] top of the stack
; ----- start aux = vet[i] -----
MOV BX, CX ; offset (BX) receives i
MOV DX, [SI+BX] ; DX receives vet_2[i]
MOV AUX, DX ; AUX receives DX
; ----- end aux = vet[i] -----
; ----- start vet[i] = vet[min] -----
POP AX ; AX receives the top of the stack (vet_2[min])
MOV [SI+BX], AX ; vet_2[i] receives DX (smaller element)
; ----- end vet[i] = vet[min] -----
; ----- start vet[min] = aux -----
POP BX ; offset (BX) receives the position of the smaller element from the stack
MOV DX, AUX ; DX receives AUX
MOV [SI+BX], DX ; vet_2[min] receives DX
; ----- end vet[min] = aux -----
ADD CX, 2 ; INC 2 on i
PUSH CX ; put in the stack
JMP SLC_LOOP_FORA repeat outer loop
; ----- end for(int i = 0; i < n-1; i++) -----
SLC_FIM: ; end the procedure
RET
SELECTION_SORT ENDP
Before call selection sort procedure: 2 7 0 1 4 8 9 3 6 5
After call selection sort procedure: 5 2 7 0 1 4 8 9 3 6
Where is the error? Can somebody help me?
assembly x86-16 selection-sort
assembly x86-16 selection-sort
edited Nov 17 '18 at 14:37
Fifoernik
7,85911324
7,85911324
asked Nov 15 '18 at 16:35
Erick MatheusErick Matheus
211
211
2
POP CX
inside a loop looks suspicious to me, because youpush 0
outside the loop, but there's aJE SLC_LOOP_FORA
that can repeat that without pushing anything. So you're going to consume some stack space.xor cx,cx
ormov cx,0
would be a lot more sensible. If you run out of registers for locals, normally you store/reload them to locations like[bp-4]
after making a stack frame. (You can't use[SP+4]
or whatever in 16-bit code). push/pop inside loops is harder to reason about, leading to errors exactly like this.
– Peter Cordes
Nov 15 '18 at 16:54
1
Have you tried single-stepping through your code with a debugger to see if it leaves the loop early? And if so, you can see exactly which branch went wrong.
– Peter Cordes
Nov 15 '18 at 17:28
add a comment |
2
POP CX
inside a loop looks suspicious to me, because youpush 0
outside the loop, but there's aJE SLC_LOOP_FORA
that can repeat that without pushing anything. So you're going to consume some stack space.xor cx,cx
ormov cx,0
would be a lot more sensible. If you run out of registers for locals, normally you store/reload them to locations like[bp-4]
after making a stack frame. (You can't use[SP+4]
or whatever in 16-bit code). push/pop inside loops is harder to reason about, leading to errors exactly like this.
– Peter Cordes
Nov 15 '18 at 16:54
1
Have you tried single-stepping through your code with a debugger to see if it leaves the loop early? And if so, you can see exactly which branch went wrong.
– Peter Cordes
Nov 15 '18 at 17:28
2
2
POP CX
inside a loop looks suspicious to me, because you push 0
outside the loop, but there's a JE SLC_LOOP_FORA
that can repeat that without pushing anything. So you're going to consume some stack space. xor cx,cx
or mov cx,0
would be a lot more sensible. If you run out of registers for locals, normally you store/reload them to locations like [bp-4]
after making a stack frame. (You can't use [SP+4]
or whatever in 16-bit code). push/pop inside loops is harder to reason about, leading to errors exactly like this.– Peter Cordes
Nov 15 '18 at 16:54
POP CX
inside a loop looks suspicious to me, because you push 0
outside the loop, but there's a JE SLC_LOOP_FORA
that can repeat that without pushing anything. So you're going to consume some stack space. xor cx,cx
or mov cx,0
would be a lot more sensible. If you run out of registers for locals, normally you store/reload them to locations like [bp-4]
after making a stack frame. (You can't use [SP+4]
or whatever in 16-bit code). push/pop inside loops is harder to reason about, leading to errors exactly like this.– Peter Cordes
Nov 15 '18 at 16:54
1
1
Have you tried single-stepping through your code with a debugger to see if it leaves the loop early? And if so, you can see exactly which branch went wrong.
– Peter Cordes
Nov 15 '18 at 17:28
Have you tried single-stepping through your code with a debugger to see if it leaves the loop early? And if so, you can see exactly which branch went wrong.
– Peter Cordes
Nov 15 '18 at 17:28
add a comment |
3 Answers
3
active
oldest
votes
Problem
When you don't need to swap 2 elements, you still need to raise the lower bound in CX
. The cmp cx, bx
je SLC_LOOP_FORA
neglects this. Moreover it leaves you with an unbalanced stack (popping more than was pushed).
Solution:
The problem (included the stack un-balance) is easily corrected by introducing an extra label:
PUSH 0 ; to initialize i
MOV SI, OFFSET VET_2
SLC_LOOP_FORA: ; outer loop
...
CMP CX, BX ; compare i with the position of the smaller element
JE NO_SWAP ; if equals, repeat outer loop, otherwise do the swap
...
NO_SWAP:
ADD CX, 2 ; INC 2 on i
PUSH CX ; put in the stack
JMP SLC_LOOP_FORA ; repeat outer loop
Consider
; AX = j & aux CX = i
; BX = offset/min DX = data and others
PUSH 0 ; to initialize i
MOV SI, [OFFSET VET_2]
If you were willing to re-assign registers, you could enormously simplify this program.
With register assignments like
; AX = j & aux DI = i
; SI = offset/min DX = data and others
PUSH 0 ; to initialize i
MOV BX, OFFSET VET_2
the swap code e.g. would become just these 3 instructions:
mov ax, [bx+si]
xchg ax, [bx+di]
mov [bx+si], ax
add a comment |
MOV SI, [OFFSET VET_2]
I've never seen an assembler that would do it this way! The above sets the SI
register equal to the contents of the first array element, so SI=2
. Not really useful.
Acceptable instructions to get the address of your vector are:
mov si, offset VET_2
or
lea si, [VET_2]
or
lea si, VET_2
2
According to Ross' answer on Confusing brackets in MASM32mov si, [OFFSET VET_2]
is still an immediate. MASM apparently only respects square brackets around numeric literals, or if you useds:[stuff]
. (Seems insane and terrible, and I wouldn't recommend ever using square brackets around an immediate except as intentional obfuscation, but if you're stuck using MASM then that's how it works.)
– Peter Cordes
Nov 17 '18 at 14:58
+1 for exposing this pure madness in MASM. Thanks to @PeterCordes for the link to Ross' answer. I had no idea...
– Sep Roland
Nov 25 '18 at 22:04
add a comment |
selection_sort_i64:
; /*
; Input:
; RDI = long long * array
; RSI = length
;
; Pseudo-C code:
;
; for (int i = 0; i < n - 1; ++i) {
; min = i
; for (int j = i + 1; j < n; ++j) {
; if a[j] < a[min]; min = j
; swap(i, min)
;*/
I64_SIZE equ 8
LG_I64_SIZE equ 3
cmp rsi, 1
jle .end ; Just end it if length <= 1
xchg rsi, rdi ; rsi => left pointer
mov rcx, rdi ; rcx => length
; RDX will be the boundary for i: RSI + (N-1)*sizeof(int64)
lea rdx, [rsi + rcx * LL_SIZE - LL_SIZE]
; /*
; Let's explain what's happening here.
; RSI will be &a[i], RDX will be it's right boundary
; RDI will be &a[j] (&a[i + 1]) and will loop n-i times
; RCX will be n-i and will be the counter for the inner loop
; RAX will track our minimum in the remaining of the array
; RBX will
; */
.outer_loop:
cmp rsi, rdx
jge .end ; while (i < n - 1) {
mov rax, [rsi] ; min = a[i]
mov rbx, rsi ; min_i = i
push rax
mov rdi, rsi
add rdi, I64_SIZE ; j = i + 1 // rdi => right pointer
dec rcx ; // rcx = n - i, inner loop counter
push rcx
.inner_loop:
cmp rax, [rdi] ; if (min > a[j])
jle .min_less_or_equal
mov rbx, rdi ; min_i = j
mov rax, [rdi] ; min = a[j]
.min_less_or_equal:
add rdi, I64_SIZE ; j += 1
loop .inner_loop
pop rcx ; // restore inner loop counter
pop rax ; // restore a[i]
cmp rsi, rbx ; // swap if minimum found
jz .no_min ; if (i != min_i)
mov rdi, [rbx] ; temp = a[min]
mov [rbx], rax ; a[min] = a[i]
mov [rsi], rdi ; a[i] = temp
.no_min:
add rsi, I64_SIZE ; i += 1
jmp .outer_loop ; } // end outer loop
.end:
ret
Hope this works. Thanks.
2
The OP is looking for help debugging their code, not for some random selection-sort implementation. This is an x86-16 question, but at least you didn't use any of r8..r15 or any addressing modes that aren't available in 16-bit mode, so you actually could port it to x86-64 directly just by changing the register sizes andI64_SIZE
.
– Peter Cordes
Nov 16 '18 at 15:25
However, in x86-64 you should be using r8 or something for one of the loop counters, so you don't need to push/pop rcx inside the loop. Theloop
instruction is slow; only use it if optimizing for code-size over speed. Why is the loop instruction slow? Couldn't Intel have implemented it efficiently?
– Peter Cordes
Nov 16 '18 at 15:28
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53323996%2fhow-to-selection-sort-assembly-8086%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
Problem
When you don't need to swap 2 elements, you still need to raise the lower bound in CX
. The cmp cx, bx
je SLC_LOOP_FORA
neglects this. Moreover it leaves you with an unbalanced stack (popping more than was pushed).
Solution:
The problem (included the stack un-balance) is easily corrected by introducing an extra label:
PUSH 0 ; to initialize i
MOV SI, OFFSET VET_2
SLC_LOOP_FORA: ; outer loop
...
CMP CX, BX ; compare i with the position of the smaller element
JE NO_SWAP ; if equals, repeat outer loop, otherwise do the swap
...
NO_SWAP:
ADD CX, 2 ; INC 2 on i
PUSH CX ; put in the stack
JMP SLC_LOOP_FORA ; repeat outer loop
Consider
; AX = j & aux CX = i
; BX = offset/min DX = data and others
PUSH 0 ; to initialize i
MOV SI, [OFFSET VET_2]
If you were willing to re-assign registers, you could enormously simplify this program.
With register assignments like
; AX = j & aux DI = i
; SI = offset/min DX = data and others
PUSH 0 ; to initialize i
MOV BX, OFFSET VET_2
the swap code e.g. would become just these 3 instructions:
mov ax, [bx+si]
xchg ax, [bx+di]
mov [bx+si], ax
add a comment |
Problem
When you don't need to swap 2 elements, you still need to raise the lower bound in CX
. The cmp cx, bx
je SLC_LOOP_FORA
neglects this. Moreover it leaves you with an unbalanced stack (popping more than was pushed).
Solution:
The problem (included the stack un-balance) is easily corrected by introducing an extra label:
PUSH 0 ; to initialize i
MOV SI, OFFSET VET_2
SLC_LOOP_FORA: ; outer loop
...
CMP CX, BX ; compare i with the position of the smaller element
JE NO_SWAP ; if equals, repeat outer loop, otherwise do the swap
...
NO_SWAP:
ADD CX, 2 ; INC 2 on i
PUSH CX ; put in the stack
JMP SLC_LOOP_FORA ; repeat outer loop
Consider
; AX = j & aux CX = i
; BX = offset/min DX = data and others
PUSH 0 ; to initialize i
MOV SI, [OFFSET VET_2]
If you were willing to re-assign registers, you could enormously simplify this program.
With register assignments like
; AX = j & aux DI = i
; SI = offset/min DX = data and others
PUSH 0 ; to initialize i
MOV BX, OFFSET VET_2
the swap code e.g. would become just these 3 instructions:
mov ax, [bx+si]
xchg ax, [bx+di]
mov [bx+si], ax
add a comment |
Problem
When you don't need to swap 2 elements, you still need to raise the lower bound in CX
. The cmp cx, bx
je SLC_LOOP_FORA
neglects this. Moreover it leaves you with an unbalanced stack (popping more than was pushed).
Solution:
The problem (included the stack un-balance) is easily corrected by introducing an extra label:
PUSH 0 ; to initialize i
MOV SI, OFFSET VET_2
SLC_LOOP_FORA: ; outer loop
...
CMP CX, BX ; compare i with the position of the smaller element
JE NO_SWAP ; if equals, repeat outer loop, otherwise do the swap
...
NO_SWAP:
ADD CX, 2 ; INC 2 on i
PUSH CX ; put in the stack
JMP SLC_LOOP_FORA ; repeat outer loop
Consider
; AX = j & aux CX = i
; BX = offset/min DX = data and others
PUSH 0 ; to initialize i
MOV SI, [OFFSET VET_2]
If you were willing to re-assign registers, you could enormously simplify this program.
With register assignments like
; AX = j & aux DI = i
; SI = offset/min DX = data and others
PUSH 0 ; to initialize i
MOV BX, OFFSET VET_2
the swap code e.g. would become just these 3 instructions:
mov ax, [bx+si]
xchg ax, [bx+di]
mov [bx+si], ax
Problem
When you don't need to swap 2 elements, you still need to raise the lower bound in CX
. The cmp cx, bx
je SLC_LOOP_FORA
neglects this. Moreover it leaves you with an unbalanced stack (popping more than was pushed).
Solution:
The problem (included the stack un-balance) is easily corrected by introducing an extra label:
PUSH 0 ; to initialize i
MOV SI, OFFSET VET_2
SLC_LOOP_FORA: ; outer loop
...
CMP CX, BX ; compare i with the position of the smaller element
JE NO_SWAP ; if equals, repeat outer loop, otherwise do the swap
...
NO_SWAP:
ADD CX, 2 ; INC 2 on i
PUSH CX ; put in the stack
JMP SLC_LOOP_FORA ; repeat outer loop
Consider
; AX = j & aux CX = i
; BX = offset/min DX = data and others
PUSH 0 ; to initialize i
MOV SI, [OFFSET VET_2]
If you were willing to re-assign registers, you could enormously simplify this program.
With register assignments like
; AX = j & aux DI = i
; SI = offset/min DX = data and others
PUSH 0 ; to initialize i
MOV BX, OFFSET VET_2
the swap code e.g. would become just these 3 instructions:
mov ax, [bx+si]
xchg ax, [bx+di]
mov [bx+si], ax
answered Nov 25 '18 at 22:51
Sep RolandSep Roland
12.5k22047
12.5k22047
add a comment |
add a comment |
MOV SI, [OFFSET VET_2]
I've never seen an assembler that would do it this way! The above sets the SI
register equal to the contents of the first array element, so SI=2
. Not really useful.
Acceptable instructions to get the address of your vector are:
mov si, offset VET_2
or
lea si, [VET_2]
or
lea si, VET_2
2
According to Ross' answer on Confusing brackets in MASM32mov si, [OFFSET VET_2]
is still an immediate. MASM apparently only respects square brackets around numeric literals, or if you useds:[stuff]
. (Seems insane and terrible, and I wouldn't recommend ever using square brackets around an immediate except as intentional obfuscation, but if you're stuck using MASM then that's how it works.)
– Peter Cordes
Nov 17 '18 at 14:58
+1 for exposing this pure madness in MASM. Thanks to @PeterCordes for the link to Ross' answer. I had no idea...
– Sep Roland
Nov 25 '18 at 22:04
add a comment |
MOV SI, [OFFSET VET_2]
I've never seen an assembler that would do it this way! The above sets the SI
register equal to the contents of the first array element, so SI=2
. Not really useful.
Acceptable instructions to get the address of your vector are:
mov si, offset VET_2
or
lea si, [VET_2]
or
lea si, VET_2
2
According to Ross' answer on Confusing brackets in MASM32mov si, [OFFSET VET_2]
is still an immediate. MASM apparently only respects square brackets around numeric literals, or if you useds:[stuff]
. (Seems insane and terrible, and I wouldn't recommend ever using square brackets around an immediate except as intentional obfuscation, but if you're stuck using MASM then that's how it works.)
– Peter Cordes
Nov 17 '18 at 14:58
+1 for exposing this pure madness in MASM. Thanks to @PeterCordes for the link to Ross' answer. I had no idea...
– Sep Roland
Nov 25 '18 at 22:04
add a comment |
MOV SI, [OFFSET VET_2]
I've never seen an assembler that would do it this way! The above sets the SI
register equal to the contents of the first array element, so SI=2
. Not really useful.
Acceptable instructions to get the address of your vector are:
mov si, offset VET_2
or
lea si, [VET_2]
or
lea si, VET_2
MOV SI, [OFFSET VET_2]
I've never seen an assembler that would do it this way! The above sets the SI
register equal to the contents of the first array element, so SI=2
. Not really useful.
Acceptable instructions to get the address of your vector are:
mov si, offset VET_2
or
lea si, [VET_2]
or
lea si, VET_2
answered Nov 17 '18 at 14:47
FifoernikFifoernik
7,85911324
7,85911324
2
According to Ross' answer on Confusing brackets in MASM32mov si, [OFFSET VET_2]
is still an immediate. MASM apparently only respects square brackets around numeric literals, or if you useds:[stuff]
. (Seems insane and terrible, and I wouldn't recommend ever using square brackets around an immediate except as intentional obfuscation, but if you're stuck using MASM then that's how it works.)
– Peter Cordes
Nov 17 '18 at 14:58
+1 for exposing this pure madness in MASM. Thanks to @PeterCordes for the link to Ross' answer. I had no idea...
– Sep Roland
Nov 25 '18 at 22:04
add a comment |
2
According to Ross' answer on Confusing brackets in MASM32mov si, [OFFSET VET_2]
is still an immediate. MASM apparently only respects square brackets around numeric literals, or if you useds:[stuff]
. (Seems insane and terrible, and I wouldn't recommend ever using square brackets around an immediate except as intentional obfuscation, but if you're stuck using MASM then that's how it works.)
– Peter Cordes
Nov 17 '18 at 14:58
+1 for exposing this pure madness in MASM. Thanks to @PeterCordes for the link to Ross' answer. I had no idea...
– Sep Roland
Nov 25 '18 at 22:04
2
2
According to Ross' answer on Confusing brackets in MASM32
mov si, [OFFSET VET_2]
is still an immediate. MASM apparently only respects square brackets around numeric literals, or if you use ds:[stuff]
. (Seems insane and terrible, and I wouldn't recommend ever using square brackets around an immediate except as intentional obfuscation, but if you're stuck using MASM then that's how it works.)– Peter Cordes
Nov 17 '18 at 14:58
According to Ross' answer on Confusing brackets in MASM32
mov si, [OFFSET VET_2]
is still an immediate. MASM apparently only respects square brackets around numeric literals, or if you use ds:[stuff]
. (Seems insane and terrible, and I wouldn't recommend ever using square brackets around an immediate except as intentional obfuscation, but if you're stuck using MASM then that's how it works.)– Peter Cordes
Nov 17 '18 at 14:58
+1 for exposing this pure madness in MASM. Thanks to @PeterCordes for the link to Ross' answer. I had no idea...
– Sep Roland
Nov 25 '18 at 22:04
+1 for exposing this pure madness in MASM. Thanks to @PeterCordes for the link to Ross' answer. I had no idea...
– Sep Roland
Nov 25 '18 at 22:04
add a comment |
selection_sort_i64:
; /*
; Input:
; RDI = long long * array
; RSI = length
;
; Pseudo-C code:
;
; for (int i = 0; i < n - 1; ++i) {
; min = i
; for (int j = i + 1; j < n; ++j) {
; if a[j] < a[min]; min = j
; swap(i, min)
;*/
I64_SIZE equ 8
LG_I64_SIZE equ 3
cmp rsi, 1
jle .end ; Just end it if length <= 1
xchg rsi, rdi ; rsi => left pointer
mov rcx, rdi ; rcx => length
; RDX will be the boundary for i: RSI + (N-1)*sizeof(int64)
lea rdx, [rsi + rcx * LL_SIZE - LL_SIZE]
; /*
; Let's explain what's happening here.
; RSI will be &a[i], RDX will be it's right boundary
; RDI will be &a[j] (&a[i + 1]) and will loop n-i times
; RCX will be n-i and will be the counter for the inner loop
; RAX will track our minimum in the remaining of the array
; RBX will
; */
.outer_loop:
cmp rsi, rdx
jge .end ; while (i < n - 1) {
mov rax, [rsi] ; min = a[i]
mov rbx, rsi ; min_i = i
push rax
mov rdi, rsi
add rdi, I64_SIZE ; j = i + 1 // rdi => right pointer
dec rcx ; // rcx = n - i, inner loop counter
push rcx
.inner_loop:
cmp rax, [rdi] ; if (min > a[j])
jle .min_less_or_equal
mov rbx, rdi ; min_i = j
mov rax, [rdi] ; min = a[j]
.min_less_or_equal:
add rdi, I64_SIZE ; j += 1
loop .inner_loop
pop rcx ; // restore inner loop counter
pop rax ; // restore a[i]
cmp rsi, rbx ; // swap if minimum found
jz .no_min ; if (i != min_i)
mov rdi, [rbx] ; temp = a[min]
mov [rbx], rax ; a[min] = a[i]
mov [rsi], rdi ; a[i] = temp
.no_min:
add rsi, I64_SIZE ; i += 1
jmp .outer_loop ; } // end outer loop
.end:
ret
Hope this works. Thanks.
2
The OP is looking for help debugging their code, not for some random selection-sort implementation. This is an x86-16 question, but at least you didn't use any of r8..r15 or any addressing modes that aren't available in 16-bit mode, so you actually could port it to x86-64 directly just by changing the register sizes andI64_SIZE
.
– Peter Cordes
Nov 16 '18 at 15:25
However, in x86-64 you should be using r8 or something for one of the loop counters, so you don't need to push/pop rcx inside the loop. Theloop
instruction is slow; only use it if optimizing for code-size over speed. Why is the loop instruction slow? Couldn't Intel have implemented it efficiently?
– Peter Cordes
Nov 16 '18 at 15:28
add a comment |
selection_sort_i64:
; /*
; Input:
; RDI = long long * array
; RSI = length
;
; Pseudo-C code:
;
; for (int i = 0; i < n - 1; ++i) {
; min = i
; for (int j = i + 1; j < n; ++j) {
; if a[j] < a[min]; min = j
; swap(i, min)
;*/
I64_SIZE equ 8
LG_I64_SIZE equ 3
cmp rsi, 1
jle .end ; Just end it if length <= 1
xchg rsi, rdi ; rsi => left pointer
mov rcx, rdi ; rcx => length
; RDX will be the boundary for i: RSI + (N-1)*sizeof(int64)
lea rdx, [rsi + rcx * LL_SIZE - LL_SIZE]
; /*
; Let's explain what's happening here.
; RSI will be &a[i], RDX will be it's right boundary
; RDI will be &a[j] (&a[i + 1]) and will loop n-i times
; RCX will be n-i and will be the counter for the inner loop
; RAX will track our minimum in the remaining of the array
; RBX will
; */
.outer_loop:
cmp rsi, rdx
jge .end ; while (i < n - 1) {
mov rax, [rsi] ; min = a[i]
mov rbx, rsi ; min_i = i
push rax
mov rdi, rsi
add rdi, I64_SIZE ; j = i + 1 // rdi => right pointer
dec rcx ; // rcx = n - i, inner loop counter
push rcx
.inner_loop:
cmp rax, [rdi] ; if (min > a[j])
jle .min_less_or_equal
mov rbx, rdi ; min_i = j
mov rax, [rdi] ; min = a[j]
.min_less_or_equal:
add rdi, I64_SIZE ; j += 1
loop .inner_loop
pop rcx ; // restore inner loop counter
pop rax ; // restore a[i]
cmp rsi, rbx ; // swap if minimum found
jz .no_min ; if (i != min_i)
mov rdi, [rbx] ; temp = a[min]
mov [rbx], rax ; a[min] = a[i]
mov [rsi], rdi ; a[i] = temp
.no_min:
add rsi, I64_SIZE ; i += 1
jmp .outer_loop ; } // end outer loop
.end:
ret
Hope this works. Thanks.
2
The OP is looking for help debugging their code, not for some random selection-sort implementation. This is an x86-16 question, but at least you didn't use any of r8..r15 or any addressing modes that aren't available in 16-bit mode, so you actually could port it to x86-64 directly just by changing the register sizes andI64_SIZE
.
– Peter Cordes
Nov 16 '18 at 15:25
However, in x86-64 you should be using r8 or something for one of the loop counters, so you don't need to push/pop rcx inside the loop. Theloop
instruction is slow; only use it if optimizing for code-size over speed. Why is the loop instruction slow? Couldn't Intel have implemented it efficiently?
– Peter Cordes
Nov 16 '18 at 15:28
add a comment |
selection_sort_i64:
; /*
; Input:
; RDI = long long * array
; RSI = length
;
; Pseudo-C code:
;
; for (int i = 0; i < n - 1; ++i) {
; min = i
; for (int j = i + 1; j < n; ++j) {
; if a[j] < a[min]; min = j
; swap(i, min)
;*/
I64_SIZE equ 8
LG_I64_SIZE equ 3
cmp rsi, 1
jle .end ; Just end it if length <= 1
xchg rsi, rdi ; rsi => left pointer
mov rcx, rdi ; rcx => length
; RDX will be the boundary for i: RSI + (N-1)*sizeof(int64)
lea rdx, [rsi + rcx * LL_SIZE - LL_SIZE]
; /*
; Let's explain what's happening here.
; RSI will be &a[i], RDX will be it's right boundary
; RDI will be &a[j] (&a[i + 1]) and will loop n-i times
; RCX will be n-i and will be the counter for the inner loop
; RAX will track our minimum in the remaining of the array
; RBX will
; */
.outer_loop:
cmp rsi, rdx
jge .end ; while (i < n - 1) {
mov rax, [rsi] ; min = a[i]
mov rbx, rsi ; min_i = i
push rax
mov rdi, rsi
add rdi, I64_SIZE ; j = i + 1 // rdi => right pointer
dec rcx ; // rcx = n - i, inner loop counter
push rcx
.inner_loop:
cmp rax, [rdi] ; if (min > a[j])
jle .min_less_or_equal
mov rbx, rdi ; min_i = j
mov rax, [rdi] ; min = a[j]
.min_less_or_equal:
add rdi, I64_SIZE ; j += 1
loop .inner_loop
pop rcx ; // restore inner loop counter
pop rax ; // restore a[i]
cmp rsi, rbx ; // swap if minimum found
jz .no_min ; if (i != min_i)
mov rdi, [rbx] ; temp = a[min]
mov [rbx], rax ; a[min] = a[i]
mov [rsi], rdi ; a[i] = temp
.no_min:
add rsi, I64_SIZE ; i += 1
jmp .outer_loop ; } // end outer loop
.end:
ret
Hope this works. Thanks.
selection_sort_i64:
; /*
; Input:
; RDI = long long * array
; RSI = length
;
; Pseudo-C code:
;
; for (int i = 0; i < n - 1; ++i) {
; min = i
; for (int j = i + 1; j < n; ++j) {
; if a[j] < a[min]; min = j
; swap(i, min)
;*/
I64_SIZE equ 8
LG_I64_SIZE equ 3
cmp rsi, 1
jle .end ; Just end it if length <= 1
xchg rsi, rdi ; rsi => left pointer
mov rcx, rdi ; rcx => length
; RDX will be the boundary for i: RSI + (N-1)*sizeof(int64)
lea rdx, [rsi + rcx * LL_SIZE - LL_SIZE]
; /*
; Let's explain what's happening here.
; RSI will be &a[i], RDX will be it's right boundary
; RDI will be &a[j] (&a[i + 1]) and will loop n-i times
; RCX will be n-i and will be the counter for the inner loop
; RAX will track our minimum in the remaining of the array
; RBX will
; */
.outer_loop:
cmp rsi, rdx
jge .end ; while (i < n - 1) {
mov rax, [rsi] ; min = a[i]
mov rbx, rsi ; min_i = i
push rax
mov rdi, rsi
add rdi, I64_SIZE ; j = i + 1 // rdi => right pointer
dec rcx ; // rcx = n - i, inner loop counter
push rcx
.inner_loop:
cmp rax, [rdi] ; if (min > a[j])
jle .min_less_or_equal
mov rbx, rdi ; min_i = j
mov rax, [rdi] ; min = a[j]
.min_less_or_equal:
add rdi, I64_SIZE ; j += 1
loop .inner_loop
pop rcx ; // restore inner loop counter
pop rax ; // restore a[i]
cmp rsi, rbx ; // swap if minimum found
jz .no_min ; if (i != min_i)
mov rdi, [rbx] ; temp = a[min]
mov [rbx], rax ; a[min] = a[i]
mov [rsi], rdi ; a[i] = temp
.no_min:
add rsi, I64_SIZE ; i += 1
jmp .outer_loop ; } // end outer loop
.end:
ret
Hope this works. Thanks.
edited Nov 16 '18 at 15:23
Peter Cordes
134k19204344
134k19204344
answered Nov 16 '18 at 15:08
Niloy NiilNiloy Niil
11
11
2
The OP is looking for help debugging their code, not for some random selection-sort implementation. This is an x86-16 question, but at least you didn't use any of r8..r15 or any addressing modes that aren't available in 16-bit mode, so you actually could port it to x86-64 directly just by changing the register sizes andI64_SIZE
.
– Peter Cordes
Nov 16 '18 at 15:25
However, in x86-64 you should be using r8 or something for one of the loop counters, so you don't need to push/pop rcx inside the loop. Theloop
instruction is slow; only use it if optimizing for code-size over speed. Why is the loop instruction slow? Couldn't Intel have implemented it efficiently?
– Peter Cordes
Nov 16 '18 at 15:28
add a comment |
2
The OP is looking for help debugging their code, not for some random selection-sort implementation. This is an x86-16 question, but at least you didn't use any of r8..r15 or any addressing modes that aren't available in 16-bit mode, so you actually could port it to x86-64 directly just by changing the register sizes andI64_SIZE
.
– Peter Cordes
Nov 16 '18 at 15:25
However, in x86-64 you should be using r8 or something for one of the loop counters, so you don't need to push/pop rcx inside the loop. Theloop
instruction is slow; only use it if optimizing for code-size over speed. Why is the loop instruction slow? Couldn't Intel have implemented it efficiently?
– Peter Cordes
Nov 16 '18 at 15:28
2
2
The OP is looking for help debugging their code, not for some random selection-sort implementation. This is an x86-16 question, but at least you didn't use any of r8..r15 or any addressing modes that aren't available in 16-bit mode, so you actually could port it to x86-64 directly just by changing the register sizes and
I64_SIZE
.– Peter Cordes
Nov 16 '18 at 15:25
The OP is looking for help debugging their code, not for some random selection-sort implementation. This is an x86-16 question, but at least you didn't use any of r8..r15 or any addressing modes that aren't available in 16-bit mode, so you actually could port it to x86-64 directly just by changing the register sizes and
I64_SIZE
.– Peter Cordes
Nov 16 '18 at 15:25
However, in x86-64 you should be using r8 or something for one of the loop counters, so you don't need to push/pop rcx inside the loop. The
loop
instruction is slow; only use it if optimizing for code-size over speed. Why is the loop instruction slow? Couldn't Intel have implemented it efficiently?– Peter Cordes
Nov 16 '18 at 15:28
However, in x86-64 you should be using r8 or something for one of the loop counters, so you don't need to push/pop rcx inside the loop. The
loop
instruction is slow; only use it if optimizing for code-size over speed. Why is the loop instruction slow? Couldn't Intel have implemented it efficiently?– Peter Cordes
Nov 16 '18 at 15:28
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53323996%2fhow-to-selection-sort-assembly-8086%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
2
POP CX
inside a loop looks suspicious to me, because youpush 0
outside the loop, but there's aJE SLC_LOOP_FORA
that can repeat that without pushing anything. So you're going to consume some stack space.xor cx,cx
ormov cx,0
would be a lot more sensible. If you run out of registers for locals, normally you store/reload them to locations like[bp-4]
after making a stack frame. (You can't use[SP+4]
or whatever in 16-bit code). push/pop inside loops is harder to reason about, leading to errors exactly like this.– Peter Cordes
Nov 15 '18 at 16:54
1
Have you tried single-stepping through your code with a debugger to see if it leaves the loop early? And if so, you can see exactly which branch went wrong.
– Peter Cordes
Nov 15 '18 at 17:28