String reversal in PythonReversing every word in a stringFinding the longest unique sub-string in a...
What does a stand alone "T" index value do?
Who deserves to be first and second author? PhD student who collected data, research associate who wrote the paper or supervisor?
Why would one plane in this picture not have gear down yet?
Low budget alien movie about the Earth being cooked
Grey hair or white hair
Is "history" a male-biased word ("his+story")?
How could our ancestors have domesticated a solitary predator?
Can you reject a postdoc offer after the PI has paid a large sum for flights/accommodation for your visit?
Accountant/ lawyer will not return my call
Good allowance savings plan?
Aliens englobed the Solar System: will we notice?
Can someone explain what is being said here in color publishing in the American Mathematical Monthly?
How did Alan Turing break the enigma code using the hint given by the lady in the bar?
Could you please stop shuffling the deck and play already?
If the Captain's screens are out, does he switch seats with the co-pilot?
Should I take out a loan for a friend to invest on my behalf?
Latest web browser compatible with Windows 98
Replacing Windows 7 security updates with anti-virus?
Does "variables should live in the smallest scope as possible" include the case "variables should not exist if possible"?
Why does the negative sign arise in this thermodynamic relation?
Are the terms "stab" and "staccato" synonyms?
The bar has been raised
Is Gradient Descent central to every optimizer?
Do items de-spawn in Diablo?
String reversal in Python
Reversing every word in a stringFinding the longest unique sub-string in a stringMethod to replace all spaces in a String with '%20'Finding the fastest common prefix of 2 strings in PythonFinding longest common prefixURLify a string using PythonLargest substring which starts and ends with some substringFind smallest subset prefixesReversing the vowels in a StringString reversing x times
$begingroup$
This code will take as output any string (long, empty, with spaces...) and will return its reverse. Can I improve this algorithm so that it can be faster?
Right now its complexity is $O(n)$.
def reverse(stri):
output = ''
length = len(stri)
while length > 0:
output += stri[-1]
stri, length = (stri[0:length - 1], length - 1)
return output
python performance strings
New contributor
$endgroup$
|
show 1 more comment
$begingroup$
This code will take as output any string (long, empty, with spaces...) and will return its reverse. Can I improve this algorithm so that it can be faster?
Right now its complexity is $O(n)$.
def reverse(stri):
output = ''
length = len(stri)
while length > 0:
output += stri[-1]
stri, length = (stri[0:length - 1], length - 1)
return output
python performance strings
New contributor
$endgroup$
11
$begingroup$
I'd say your code isO(n**2)
because it creates at least n strings with length between1
andn
.
$endgroup$
– Eric Duminil
yesterday
1
$begingroup$
I think you're right, since I didn't take into account the immutability of strings in python, but the graph of @Graipher shows something similar to O(nlgn) complexity for my code, I don't know I'm a bit confused...
$endgroup$
– Midos
yesterday
1
$begingroup$
Depending on how large the string is and what you want to do with it, it might be a good idea to create a lazyReverseString
class, which only accesses the data when needed.
$endgroup$
– Eric Duminil
yesterday
2
$begingroup$
@Midos That might be due to the Python implementation used. AFAIK, cPython does sometimes manage to reuse one of the two strings (i.e. it tries to reserve enough space after/before one of the two strings and needs to copy only one string, but don't quote me on that). However, that is both implementation dependent and potentially dependent on the available memory, which is why Python's official style-guide recommends not relying on it.
$endgroup$
– Graipher
yesterday
$begingroup$
@Midos I recommend having a look at my answer now. It shows how you can speed things up with C functions that you call from Python.
$endgroup$
– Broman
7 hours ago
|
show 1 more comment
$begingroup$
This code will take as output any string (long, empty, with spaces...) and will return its reverse. Can I improve this algorithm so that it can be faster?
Right now its complexity is $O(n)$.
def reverse(stri):
output = ''
length = len(stri)
while length > 0:
output += stri[-1]
stri, length = (stri[0:length - 1], length - 1)
return output
python performance strings
New contributor
$endgroup$
This code will take as output any string (long, empty, with spaces...) and will return its reverse. Can I improve this algorithm so that it can be faster?
Right now its complexity is $O(n)$.
def reverse(stri):
output = ''
length = len(stri)
while length > 0:
output += stri[-1]
stri, length = (stri[0:length - 1], length - 1)
return output
python performance strings
python performance strings
New contributor
New contributor
edited 1 hour ago
Jamal♦
30.4k11121227
30.4k11121227
New contributor
asked yesterday
MidosMidos
8116
8116
New contributor
New contributor
11
$begingroup$
I'd say your code isO(n**2)
because it creates at least n strings with length between1
andn
.
$endgroup$
– Eric Duminil
yesterday
1
$begingroup$
I think you're right, since I didn't take into account the immutability of strings in python, but the graph of @Graipher shows something similar to O(nlgn) complexity for my code, I don't know I'm a bit confused...
$endgroup$
– Midos
yesterday
1
$begingroup$
Depending on how large the string is and what you want to do with it, it might be a good idea to create a lazyReverseString
class, which only accesses the data when needed.
$endgroup$
– Eric Duminil
yesterday
2
$begingroup$
@Midos That might be due to the Python implementation used. AFAIK, cPython does sometimes manage to reuse one of the two strings (i.e. it tries to reserve enough space after/before one of the two strings and needs to copy only one string, but don't quote me on that). However, that is both implementation dependent and potentially dependent on the available memory, which is why Python's official style-guide recommends not relying on it.
$endgroup$
– Graipher
yesterday
$begingroup$
@Midos I recommend having a look at my answer now. It shows how you can speed things up with C functions that you call from Python.
$endgroup$
– Broman
7 hours ago
|
show 1 more comment
11
$begingroup$
I'd say your code isO(n**2)
because it creates at least n strings with length between1
andn
.
$endgroup$
– Eric Duminil
yesterday
1
$begingroup$
I think you're right, since I didn't take into account the immutability of strings in python, but the graph of @Graipher shows something similar to O(nlgn) complexity for my code, I don't know I'm a bit confused...
$endgroup$
– Midos
yesterday
1
$begingroup$
Depending on how large the string is and what you want to do with it, it might be a good idea to create a lazyReverseString
class, which only accesses the data when needed.
$endgroup$
– Eric Duminil
yesterday
2
$begingroup$
@Midos That might be due to the Python implementation used. AFAIK, cPython does sometimes manage to reuse one of the two strings (i.e. it tries to reserve enough space after/before one of the two strings and needs to copy only one string, but don't quote me on that). However, that is both implementation dependent and potentially dependent on the available memory, which is why Python's official style-guide recommends not relying on it.
$endgroup$
– Graipher
yesterday
$begingroup$
@Midos I recommend having a look at my answer now. It shows how you can speed things up with C functions that you call from Python.
$endgroup$
– Broman
7 hours ago
11
11
$begingroup$
I'd say your code is
O(n**2)
because it creates at least n strings with length between 1
and n
.$endgroup$
– Eric Duminil
yesterday
$begingroup$
I'd say your code is
O(n**2)
because it creates at least n strings with length between 1
and n
.$endgroup$
– Eric Duminil
yesterday
1
1
$begingroup$
I think you're right, since I didn't take into account the immutability of strings in python, but the graph of @Graipher shows something similar to O(nlgn) complexity for my code, I don't know I'm a bit confused...
$endgroup$
– Midos
yesterday
$begingroup$
I think you're right, since I didn't take into account the immutability of strings in python, but the graph of @Graipher shows something similar to O(nlgn) complexity for my code, I don't know I'm a bit confused...
$endgroup$
– Midos
yesterday
1
1
$begingroup$
Depending on how large the string is and what you want to do with it, it might be a good idea to create a lazy
ReverseString
class, which only accesses the data when needed.$endgroup$
– Eric Duminil
yesterday
$begingroup$
Depending on how large the string is and what you want to do with it, it might be a good idea to create a lazy
ReverseString
class, which only accesses the data when needed.$endgroup$
– Eric Duminil
yesterday
2
2
$begingroup$
@Midos That might be due to the Python implementation used. AFAIK, cPython does sometimes manage to reuse one of the two strings (i.e. it tries to reserve enough space after/before one of the two strings and needs to copy only one string, but don't quote me on that). However, that is both implementation dependent and potentially dependent on the available memory, which is why Python's official style-guide recommends not relying on it.
$endgroup$
– Graipher
yesterday
$begingroup$
@Midos That might be due to the Python implementation used. AFAIK, cPython does sometimes manage to reuse one of the two strings (i.e. it tries to reserve enough space after/before one of the two strings and needs to copy only one string, but don't quote me on that). However, that is both implementation dependent and potentially dependent on the available memory, which is why Python's official style-guide recommends not relying on it.
$endgroup$
– Graipher
yesterday
$begingroup$
@Midos I recommend having a look at my answer now. It shows how you can speed things up with C functions that you call from Python.
$endgroup$
– Broman
7 hours ago
$begingroup$
@Midos I recommend having a look at my answer now. It shows how you can speed things up with C functions that you call from Python.
$endgroup$
– Broman
7 hours ago
|
show 1 more comment
3 Answers
3
active
oldest
votes
$begingroup$
Yes, this can be faster. Adding strings using +
is usually a bad idea in Python, since strings are immutable. This means that whenever you add two strings, a new string needs to be allocated with the size of the resulting strings and then both string contents need to be copied there. Instead you usually want to build a list of strings and ''.join
them at the end (where you pay this cost only once).
But here you can just use the fact that strings can be sliced and you can specify a negative step:
def reverse_g(s):
return s[::-1]
Here is a timing comparison for random strings of length up to 1M characters, where reverse
is your function and reverse_g
is this one using slicing. Note the double-log scale, for the largest string your function is almost a hundred thousand times slower.
The reverse_s
function uses the reversed
built-in, as suggested in the answer by @sleblanc and assumes you actually need the reversed string and not just an iterator over it:
def reverse_s(s):
return ''.join(reversed(s))
$endgroup$
2
$begingroup$
How did you measure the uncertainty bars, is the drop in processing time with increasing string length near length 10 forreverse_g
significant and if so why?
$endgroup$
– gerrit
yesterday
2
$begingroup$
@gerrit: The uncertainty bars come from the fact that each timing measurement is performed three times, so the value is the mean and the uncertainty the uncertainty of the mean (i.e. standard deviation / sqrt(n)). So I would doubt that the drop is significant. Sometimes you get large increases due to some other activity on the machine, so that is what could have happened in the first case.
$endgroup$
– Graipher
yesterday
2
$begingroup$
Data is code and code is data. Ponder why you might want to reverse the string, and consider instead to use an iterator that simply reads the string backwards. This is essentially what this answer is, while the actual implementation the iterator is buried inside CPython's internals. Nice answer
$endgroup$
– sleblanc
yesterday
2
$begingroup$
Adding strings using + is usually a bad idea in Python, since strings are immutable. Immutability (and the use of Python) isn't what matters here; in nearly every case, string addition requires iterating over at least one of the strings. A pre-allocated, mutable C-string concatenation still requires linear time to copy the contents of the second string into the tail of the first.
$endgroup$
– Schism
yesterday
1
$begingroup$
@IsmaelMiguel: As mentioned in the textreverse_g
is the function with the slicing, i.e. the one in the answer. Renamed it so it is the correct name now, though.
$endgroup$
– Graipher
17 hours ago
|
show 7 more comments
$begingroup$
In terms of pure complexity, the answer is simple: No, it is not possible to reverse a string faster than O(n). That is the theoretical limit when you look at the pure algorithm.
However, your code does not achieve that because the operations in the loop are not O(1). For instance, output += stri[-1]
does not do what you think it does. Python is a very high level language that does a lot of strange things under the hood compared to languages such as C. Strings are immutable in Python, which means that each time this line is executed, a completely new string is created.
If you really need the speed, you could consider writing a C function and call it from Python. Here is an example:
rev.c:
#include <stddef.h>
void reverse(char * stro, char * stri, size_t length) {
for(size_t i=0; i<length; i++) stro[i]=stri[length-1-i];
stro[length]='';
}
Compile the above function with this command:
gcc -o rev.so -shared -fPIC rev.c
And here is a python script using that function.
rev.py:
from ctypes import *
revlib = cdll.LoadLibrary("rev.so");
reverse = revlib.reverse
reverse.argtypes = [c_char_p, c_char_p, c_size_t]
hello = "HelloWorld"
stri = create_string_buffer(hello)
stro = create_string_buffer(b'00' * (len(hello)+1))
reverse(stro, stri, len(stri)-1)
print(repr(stri.value))
print(repr(stro.value))
Please note that I'm by no means an expert on this. I tested this with string of length 10⁸, and I tried the method from Graipher, calling the C function from Python and calling the C function from C. I used -O3
optimization. When I did not use any optimization it was slower to call the C function from Python. Also note that I did NOT include the time it took to create the buffers.
stri[::-1] : 0.98s
calling reverse from python : 0.59s
calling reverse from c: 0.06s
It's not a huge improvement, but it is an improvement. But the pure C program was WAY faster. The main function I used was this one:
int __attribute__((optimize("0"))) // Disable optimization for main
main(int argc, char ** argv) { // so that reverse is not inlined
const size_t size = 1e9;
char * str = malloc(size+1);
static const char alphanum[] =
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
// Take data from outside the program to fool the optimizer
alphanum[atoi(argv[1])]='7';
// Load string with random data to fool the optimizer
srand(time(NULL));
for (size_t i = 0; i < size; ++i) {
str[i] = alphanum[rand() % (sizeof(alphanum) - 1)];
}
char *o = malloc(size+1);
reverse(o, str, strlen(str));
// Do something with the data to fool the optimizer
for(size_t i=0; i<size; i++)
if(str[i] != o[size-i-1]) {
printf("Errorn");
exit(1);
}
}
Then, to get the runtime I ran:
gcc -O3 -pg rev.c; ./a.out; gprof a.out gmon.out | head -n8
$endgroup$
$begingroup$
Thus, this answer is simply wrong as it doesn't address the issue with the presented code
$endgroup$
– PascalVKooten
12 hours ago
$begingroup$
@PascalVKooten Fixed
$endgroup$
– Broman
7 hours ago
add a comment |
$begingroup$
Here's a very short code I would use to reverse any string.
a = 'Coding is fun'
rev = print("Reverse of this statement is "+ a[::-1])
New contributor
$endgroup$
4
$begingroup$
This adds nothing to Graipher's answer.
$endgroup$
– Broman
9 hours ago
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
});
});
}, "mathjax-editing");
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "196"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Midos is a new contributor. Be nice, and check out our Code of Conduct.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f215179%2fstring-reversal-in-python%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
$begingroup$
Yes, this can be faster. Adding strings using +
is usually a bad idea in Python, since strings are immutable. This means that whenever you add two strings, a new string needs to be allocated with the size of the resulting strings and then both string contents need to be copied there. Instead you usually want to build a list of strings and ''.join
them at the end (where you pay this cost only once).
But here you can just use the fact that strings can be sliced and you can specify a negative step:
def reverse_g(s):
return s[::-1]
Here is a timing comparison for random strings of length up to 1M characters, where reverse
is your function and reverse_g
is this one using slicing. Note the double-log scale, for the largest string your function is almost a hundred thousand times slower.
The reverse_s
function uses the reversed
built-in, as suggested in the answer by @sleblanc and assumes you actually need the reversed string and not just an iterator over it:
def reverse_s(s):
return ''.join(reversed(s))
$endgroup$
2
$begingroup$
How did you measure the uncertainty bars, is the drop in processing time with increasing string length near length 10 forreverse_g
significant and if so why?
$endgroup$
– gerrit
yesterday
2
$begingroup$
@gerrit: The uncertainty bars come from the fact that each timing measurement is performed three times, so the value is the mean and the uncertainty the uncertainty of the mean (i.e. standard deviation / sqrt(n)). So I would doubt that the drop is significant. Sometimes you get large increases due to some other activity on the machine, so that is what could have happened in the first case.
$endgroup$
– Graipher
yesterday
2
$begingroup$
Data is code and code is data. Ponder why you might want to reverse the string, and consider instead to use an iterator that simply reads the string backwards. This is essentially what this answer is, while the actual implementation the iterator is buried inside CPython's internals. Nice answer
$endgroup$
– sleblanc
yesterday
2
$begingroup$
Adding strings using + is usually a bad idea in Python, since strings are immutable. Immutability (and the use of Python) isn't what matters here; in nearly every case, string addition requires iterating over at least one of the strings. A pre-allocated, mutable C-string concatenation still requires linear time to copy the contents of the second string into the tail of the first.
$endgroup$
– Schism
yesterday
1
$begingroup$
@IsmaelMiguel: As mentioned in the textreverse_g
is the function with the slicing, i.e. the one in the answer. Renamed it so it is the correct name now, though.
$endgroup$
– Graipher
17 hours ago
|
show 7 more comments
$begingroup$
Yes, this can be faster. Adding strings using +
is usually a bad idea in Python, since strings are immutable. This means that whenever you add two strings, a new string needs to be allocated with the size of the resulting strings and then both string contents need to be copied there. Instead you usually want to build a list of strings and ''.join
them at the end (where you pay this cost only once).
But here you can just use the fact that strings can be sliced and you can specify a negative step:
def reverse_g(s):
return s[::-1]
Here is a timing comparison for random strings of length up to 1M characters, where reverse
is your function and reverse_g
is this one using slicing. Note the double-log scale, for the largest string your function is almost a hundred thousand times slower.
The reverse_s
function uses the reversed
built-in, as suggested in the answer by @sleblanc and assumes you actually need the reversed string and not just an iterator over it:
def reverse_s(s):
return ''.join(reversed(s))
$endgroup$
2
$begingroup$
How did you measure the uncertainty bars, is the drop in processing time with increasing string length near length 10 forreverse_g
significant and if so why?
$endgroup$
– gerrit
yesterday
2
$begingroup$
@gerrit: The uncertainty bars come from the fact that each timing measurement is performed three times, so the value is the mean and the uncertainty the uncertainty of the mean (i.e. standard deviation / sqrt(n)). So I would doubt that the drop is significant. Sometimes you get large increases due to some other activity on the machine, so that is what could have happened in the first case.
$endgroup$
– Graipher
yesterday
2
$begingroup$
Data is code and code is data. Ponder why you might want to reverse the string, and consider instead to use an iterator that simply reads the string backwards. This is essentially what this answer is, while the actual implementation the iterator is buried inside CPython's internals. Nice answer
$endgroup$
– sleblanc
yesterday
2
$begingroup$
Adding strings using + is usually a bad idea in Python, since strings are immutable. Immutability (and the use of Python) isn't what matters here; in nearly every case, string addition requires iterating over at least one of the strings. A pre-allocated, mutable C-string concatenation still requires linear time to copy the contents of the second string into the tail of the first.
$endgroup$
– Schism
yesterday
1
$begingroup$
@IsmaelMiguel: As mentioned in the textreverse_g
is the function with the slicing, i.e. the one in the answer. Renamed it so it is the correct name now, though.
$endgroup$
– Graipher
17 hours ago
|
show 7 more comments
$begingroup$
Yes, this can be faster. Adding strings using +
is usually a bad idea in Python, since strings are immutable. This means that whenever you add two strings, a new string needs to be allocated with the size of the resulting strings and then both string contents need to be copied there. Instead you usually want to build a list of strings and ''.join
them at the end (where you pay this cost only once).
But here you can just use the fact that strings can be sliced and you can specify a negative step:
def reverse_g(s):
return s[::-1]
Here is a timing comparison for random strings of length up to 1M characters, where reverse
is your function and reverse_g
is this one using slicing. Note the double-log scale, for the largest string your function is almost a hundred thousand times slower.
The reverse_s
function uses the reversed
built-in, as suggested in the answer by @sleblanc and assumes you actually need the reversed string and not just an iterator over it:
def reverse_s(s):
return ''.join(reversed(s))
$endgroup$
Yes, this can be faster. Adding strings using +
is usually a bad idea in Python, since strings are immutable. This means that whenever you add two strings, a new string needs to be allocated with the size of the resulting strings and then both string contents need to be copied there. Instead you usually want to build a list of strings and ''.join
them at the end (where you pay this cost only once).
But here you can just use the fact that strings can be sliced and you can specify a negative step:
def reverse_g(s):
return s[::-1]
Here is a timing comparison for random strings of length up to 1M characters, where reverse
is your function and reverse_g
is this one using slicing. Note the double-log scale, for the largest string your function is almost a hundred thousand times slower.
The reverse_s
function uses the reversed
built-in, as suggested in the answer by @sleblanc and assumes you actually need the reversed string and not just an iterator over it:
def reverse_s(s):
return ''.join(reversed(s))
edited 8 hours ago
answered yesterday
GraipherGraipher
25.9k54089
25.9k54089
2
$begingroup$
How did you measure the uncertainty bars, is the drop in processing time with increasing string length near length 10 forreverse_g
significant and if so why?
$endgroup$
– gerrit
yesterday
2
$begingroup$
@gerrit: The uncertainty bars come from the fact that each timing measurement is performed three times, so the value is the mean and the uncertainty the uncertainty of the mean (i.e. standard deviation / sqrt(n)). So I would doubt that the drop is significant. Sometimes you get large increases due to some other activity on the machine, so that is what could have happened in the first case.
$endgroup$
– Graipher
yesterday
2
$begingroup$
Data is code and code is data. Ponder why you might want to reverse the string, and consider instead to use an iterator that simply reads the string backwards. This is essentially what this answer is, while the actual implementation the iterator is buried inside CPython's internals. Nice answer
$endgroup$
– sleblanc
yesterday
2
$begingroup$
Adding strings using + is usually a bad idea in Python, since strings are immutable. Immutability (and the use of Python) isn't what matters here; in nearly every case, string addition requires iterating over at least one of the strings. A pre-allocated, mutable C-string concatenation still requires linear time to copy the contents of the second string into the tail of the first.
$endgroup$
– Schism
yesterday
1
$begingroup$
@IsmaelMiguel: As mentioned in the textreverse_g
is the function with the slicing, i.e. the one in the answer. Renamed it so it is the correct name now, though.
$endgroup$
– Graipher
17 hours ago
|
show 7 more comments
2
$begingroup$
How did you measure the uncertainty bars, is the drop in processing time with increasing string length near length 10 forreverse_g
significant and if so why?
$endgroup$
– gerrit
yesterday
2
$begingroup$
@gerrit: The uncertainty bars come from the fact that each timing measurement is performed three times, so the value is the mean and the uncertainty the uncertainty of the mean (i.e. standard deviation / sqrt(n)). So I would doubt that the drop is significant. Sometimes you get large increases due to some other activity on the machine, so that is what could have happened in the first case.
$endgroup$
– Graipher
yesterday
2
$begingroup$
Data is code and code is data. Ponder why you might want to reverse the string, and consider instead to use an iterator that simply reads the string backwards. This is essentially what this answer is, while the actual implementation the iterator is buried inside CPython's internals. Nice answer
$endgroup$
– sleblanc
yesterday
2
$begingroup$
Adding strings using + is usually a bad idea in Python, since strings are immutable. Immutability (and the use of Python) isn't what matters here; in nearly every case, string addition requires iterating over at least one of the strings. A pre-allocated, mutable C-string concatenation still requires linear time to copy the contents of the second string into the tail of the first.
$endgroup$
– Schism
yesterday
1
$begingroup$
@IsmaelMiguel: As mentioned in the textreverse_g
is the function with the slicing, i.e. the one in the answer. Renamed it so it is the correct name now, though.
$endgroup$
– Graipher
17 hours ago
2
2
$begingroup$
How did you measure the uncertainty bars, is the drop in processing time with increasing string length near length 10 for
reverse_g
significant and if so why?$endgroup$
– gerrit
yesterday
$begingroup$
How did you measure the uncertainty bars, is the drop in processing time with increasing string length near length 10 for
reverse_g
significant and if so why?$endgroup$
– gerrit
yesterday
2
2
$begingroup$
@gerrit: The uncertainty bars come from the fact that each timing measurement is performed three times, so the value is the mean and the uncertainty the uncertainty of the mean (i.e. standard deviation / sqrt(n)). So I would doubt that the drop is significant. Sometimes you get large increases due to some other activity on the machine, so that is what could have happened in the first case.
$endgroup$
– Graipher
yesterday
$begingroup$
@gerrit: The uncertainty bars come from the fact that each timing measurement is performed three times, so the value is the mean and the uncertainty the uncertainty of the mean (i.e. standard deviation / sqrt(n)). So I would doubt that the drop is significant. Sometimes you get large increases due to some other activity on the machine, so that is what could have happened in the first case.
$endgroup$
– Graipher
yesterday
2
2
$begingroup$
Data is code and code is data. Ponder why you might want to reverse the string, and consider instead to use an iterator that simply reads the string backwards. This is essentially what this answer is, while the actual implementation the iterator is buried inside CPython's internals. Nice answer
$endgroup$
– sleblanc
yesterday
$begingroup$
Data is code and code is data. Ponder why you might want to reverse the string, and consider instead to use an iterator that simply reads the string backwards. This is essentially what this answer is, while the actual implementation the iterator is buried inside CPython's internals. Nice answer
$endgroup$
– sleblanc
yesterday
2
2
$begingroup$
Adding strings using + is usually a bad idea in Python, since strings are immutable. Immutability (and the use of Python) isn't what matters here; in nearly every case, string addition requires iterating over at least one of the strings. A pre-allocated, mutable C-string concatenation still requires linear time to copy the contents of the second string into the tail of the first.
$endgroup$
– Schism
yesterday
$begingroup$
Adding strings using + is usually a bad idea in Python, since strings are immutable. Immutability (and the use of Python) isn't what matters here; in nearly every case, string addition requires iterating over at least one of the strings. A pre-allocated, mutable C-string concatenation still requires linear time to copy the contents of the second string into the tail of the first.
$endgroup$
– Schism
yesterday
1
1
$begingroup$
@IsmaelMiguel: As mentioned in the text
reverse_g
is the function with the slicing, i.e. the one in the answer. Renamed it so it is the correct name now, though.$endgroup$
– Graipher
17 hours ago
$begingroup$
@IsmaelMiguel: As mentioned in the text
reverse_g
is the function with the slicing, i.e. the one in the answer. Renamed it so it is the correct name now, though.$endgroup$
– Graipher
17 hours ago
|
show 7 more comments
$begingroup$
In terms of pure complexity, the answer is simple: No, it is not possible to reverse a string faster than O(n). That is the theoretical limit when you look at the pure algorithm.
However, your code does not achieve that because the operations in the loop are not O(1). For instance, output += stri[-1]
does not do what you think it does. Python is a very high level language that does a lot of strange things under the hood compared to languages such as C. Strings are immutable in Python, which means that each time this line is executed, a completely new string is created.
If you really need the speed, you could consider writing a C function and call it from Python. Here is an example:
rev.c:
#include <stddef.h>
void reverse(char * stro, char * stri, size_t length) {
for(size_t i=0; i<length; i++) stro[i]=stri[length-1-i];
stro[length]='';
}
Compile the above function with this command:
gcc -o rev.so -shared -fPIC rev.c
And here is a python script using that function.
rev.py:
from ctypes import *
revlib = cdll.LoadLibrary("rev.so");
reverse = revlib.reverse
reverse.argtypes = [c_char_p, c_char_p, c_size_t]
hello = "HelloWorld"
stri = create_string_buffer(hello)
stro = create_string_buffer(b'00' * (len(hello)+1))
reverse(stro, stri, len(stri)-1)
print(repr(stri.value))
print(repr(stro.value))
Please note that I'm by no means an expert on this. I tested this with string of length 10⁸, and I tried the method from Graipher, calling the C function from Python and calling the C function from C. I used -O3
optimization. When I did not use any optimization it was slower to call the C function from Python. Also note that I did NOT include the time it took to create the buffers.
stri[::-1] : 0.98s
calling reverse from python : 0.59s
calling reverse from c: 0.06s
It's not a huge improvement, but it is an improvement. But the pure C program was WAY faster. The main function I used was this one:
int __attribute__((optimize("0"))) // Disable optimization for main
main(int argc, char ** argv) { // so that reverse is not inlined
const size_t size = 1e9;
char * str = malloc(size+1);
static const char alphanum[] =
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
// Take data from outside the program to fool the optimizer
alphanum[atoi(argv[1])]='7';
// Load string with random data to fool the optimizer
srand(time(NULL));
for (size_t i = 0; i < size; ++i) {
str[i] = alphanum[rand() % (sizeof(alphanum) - 1)];
}
char *o = malloc(size+1);
reverse(o, str, strlen(str));
// Do something with the data to fool the optimizer
for(size_t i=0; i<size; i++)
if(str[i] != o[size-i-1]) {
printf("Errorn");
exit(1);
}
}
Then, to get the runtime I ran:
gcc -O3 -pg rev.c; ./a.out; gprof a.out gmon.out | head -n8
$endgroup$
$begingroup$
Thus, this answer is simply wrong as it doesn't address the issue with the presented code
$endgroup$
– PascalVKooten
12 hours ago
$begingroup$
@PascalVKooten Fixed
$endgroup$
– Broman
7 hours ago
add a comment |
$begingroup$
In terms of pure complexity, the answer is simple: No, it is not possible to reverse a string faster than O(n). That is the theoretical limit when you look at the pure algorithm.
However, your code does not achieve that because the operations in the loop are not O(1). For instance, output += stri[-1]
does not do what you think it does. Python is a very high level language that does a lot of strange things under the hood compared to languages such as C. Strings are immutable in Python, which means that each time this line is executed, a completely new string is created.
If you really need the speed, you could consider writing a C function and call it from Python. Here is an example:
rev.c:
#include <stddef.h>
void reverse(char * stro, char * stri, size_t length) {
for(size_t i=0; i<length; i++) stro[i]=stri[length-1-i];
stro[length]='';
}
Compile the above function with this command:
gcc -o rev.so -shared -fPIC rev.c
And here is a python script using that function.
rev.py:
from ctypes import *
revlib = cdll.LoadLibrary("rev.so");
reverse = revlib.reverse
reverse.argtypes = [c_char_p, c_char_p, c_size_t]
hello = "HelloWorld"
stri = create_string_buffer(hello)
stro = create_string_buffer(b'00' * (len(hello)+1))
reverse(stro, stri, len(stri)-1)
print(repr(stri.value))
print(repr(stro.value))
Please note that I'm by no means an expert on this. I tested this with string of length 10⁸, and I tried the method from Graipher, calling the C function from Python and calling the C function from C. I used -O3
optimization. When I did not use any optimization it was slower to call the C function from Python. Also note that I did NOT include the time it took to create the buffers.
stri[::-1] : 0.98s
calling reverse from python : 0.59s
calling reverse from c: 0.06s
It's not a huge improvement, but it is an improvement. But the pure C program was WAY faster. The main function I used was this one:
int __attribute__((optimize("0"))) // Disable optimization for main
main(int argc, char ** argv) { // so that reverse is not inlined
const size_t size = 1e9;
char * str = malloc(size+1);
static const char alphanum[] =
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
// Take data from outside the program to fool the optimizer
alphanum[atoi(argv[1])]='7';
// Load string with random data to fool the optimizer
srand(time(NULL));
for (size_t i = 0; i < size; ++i) {
str[i] = alphanum[rand() % (sizeof(alphanum) - 1)];
}
char *o = malloc(size+1);
reverse(o, str, strlen(str));
// Do something with the data to fool the optimizer
for(size_t i=0; i<size; i++)
if(str[i] != o[size-i-1]) {
printf("Errorn");
exit(1);
}
}
Then, to get the runtime I ran:
gcc -O3 -pg rev.c; ./a.out; gprof a.out gmon.out | head -n8
$endgroup$
$begingroup$
Thus, this answer is simply wrong as it doesn't address the issue with the presented code
$endgroup$
– PascalVKooten
12 hours ago
$begingroup$
@PascalVKooten Fixed
$endgroup$
– Broman
7 hours ago
add a comment |
$begingroup$
In terms of pure complexity, the answer is simple: No, it is not possible to reverse a string faster than O(n). That is the theoretical limit when you look at the pure algorithm.
However, your code does not achieve that because the operations in the loop are not O(1). For instance, output += stri[-1]
does not do what you think it does. Python is a very high level language that does a lot of strange things under the hood compared to languages such as C. Strings are immutable in Python, which means that each time this line is executed, a completely new string is created.
If you really need the speed, you could consider writing a C function and call it from Python. Here is an example:
rev.c:
#include <stddef.h>
void reverse(char * stro, char * stri, size_t length) {
for(size_t i=0; i<length; i++) stro[i]=stri[length-1-i];
stro[length]='';
}
Compile the above function with this command:
gcc -o rev.so -shared -fPIC rev.c
And here is a python script using that function.
rev.py:
from ctypes import *
revlib = cdll.LoadLibrary("rev.so");
reverse = revlib.reverse
reverse.argtypes = [c_char_p, c_char_p, c_size_t]
hello = "HelloWorld"
stri = create_string_buffer(hello)
stro = create_string_buffer(b'00' * (len(hello)+1))
reverse(stro, stri, len(stri)-1)
print(repr(stri.value))
print(repr(stro.value))
Please note that I'm by no means an expert on this. I tested this with string of length 10⁸, and I tried the method from Graipher, calling the C function from Python and calling the C function from C. I used -O3
optimization. When I did not use any optimization it was slower to call the C function from Python. Also note that I did NOT include the time it took to create the buffers.
stri[::-1] : 0.98s
calling reverse from python : 0.59s
calling reverse from c: 0.06s
It's not a huge improvement, but it is an improvement. But the pure C program was WAY faster. The main function I used was this one:
int __attribute__((optimize("0"))) // Disable optimization for main
main(int argc, char ** argv) { // so that reverse is not inlined
const size_t size = 1e9;
char * str = malloc(size+1);
static const char alphanum[] =
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
// Take data from outside the program to fool the optimizer
alphanum[atoi(argv[1])]='7';
// Load string with random data to fool the optimizer
srand(time(NULL));
for (size_t i = 0; i < size; ++i) {
str[i] = alphanum[rand() % (sizeof(alphanum) - 1)];
}
char *o = malloc(size+1);
reverse(o, str, strlen(str));
// Do something with the data to fool the optimizer
for(size_t i=0; i<size; i++)
if(str[i] != o[size-i-1]) {
printf("Errorn");
exit(1);
}
}
Then, to get the runtime I ran:
gcc -O3 -pg rev.c; ./a.out; gprof a.out gmon.out | head -n8
$endgroup$
In terms of pure complexity, the answer is simple: No, it is not possible to reverse a string faster than O(n). That is the theoretical limit when you look at the pure algorithm.
However, your code does not achieve that because the operations in the loop are not O(1). For instance, output += stri[-1]
does not do what you think it does. Python is a very high level language that does a lot of strange things under the hood compared to languages such as C. Strings are immutable in Python, which means that each time this line is executed, a completely new string is created.
If you really need the speed, you could consider writing a C function and call it from Python. Here is an example:
rev.c:
#include <stddef.h>
void reverse(char * stro, char * stri, size_t length) {
for(size_t i=0; i<length; i++) stro[i]=stri[length-1-i];
stro[length]='';
}
Compile the above function with this command:
gcc -o rev.so -shared -fPIC rev.c
And here is a python script using that function.
rev.py:
from ctypes import *
revlib = cdll.LoadLibrary("rev.so");
reverse = revlib.reverse
reverse.argtypes = [c_char_p, c_char_p, c_size_t]
hello = "HelloWorld"
stri = create_string_buffer(hello)
stro = create_string_buffer(b'00' * (len(hello)+1))
reverse(stro, stri, len(stri)-1)
print(repr(stri.value))
print(repr(stro.value))
Please note that I'm by no means an expert on this. I tested this with string of length 10⁸, and I tried the method from Graipher, calling the C function from Python and calling the C function from C. I used -O3
optimization. When I did not use any optimization it was slower to call the C function from Python. Also note that I did NOT include the time it took to create the buffers.
stri[::-1] : 0.98s
calling reverse from python : 0.59s
calling reverse from c: 0.06s
It's not a huge improvement, but it is an improvement. But the pure C program was WAY faster. The main function I used was this one:
int __attribute__((optimize("0"))) // Disable optimization for main
main(int argc, char ** argv) { // so that reverse is not inlined
const size_t size = 1e9;
char * str = malloc(size+1);
static const char alphanum[] =
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
// Take data from outside the program to fool the optimizer
alphanum[atoi(argv[1])]='7';
// Load string with random data to fool the optimizer
srand(time(NULL));
for (size_t i = 0; i < size; ++i) {
str[i] = alphanum[rand() % (sizeof(alphanum) - 1)];
}
char *o = malloc(size+1);
reverse(o, str, strlen(str));
// Do something with the data to fool the optimizer
for(size_t i=0; i<size; i++)
if(str[i] != o[size-i-1]) {
printf("Errorn");
exit(1);
}
}
Then, to get the runtime I ran:
gcc -O3 -pg rev.c; ./a.out; gprof a.out gmon.out | head -n8
edited 7 hours ago
answered 21 hours ago
BromanBroman
35419
35419
$begingroup$
Thus, this answer is simply wrong as it doesn't address the issue with the presented code
$endgroup$
– PascalVKooten
12 hours ago
$begingroup$
@PascalVKooten Fixed
$endgroup$
– Broman
7 hours ago
add a comment |
$begingroup$
Thus, this answer is simply wrong as it doesn't address the issue with the presented code
$endgroup$
– PascalVKooten
12 hours ago
$begingroup$
@PascalVKooten Fixed
$endgroup$
– Broman
7 hours ago
$begingroup$
Thus, this answer is simply wrong as it doesn't address the issue with the presented code
$endgroup$
– PascalVKooten
12 hours ago
$begingroup$
Thus, this answer is simply wrong as it doesn't address the issue with the presented code
$endgroup$
– PascalVKooten
12 hours ago
$begingroup$
@PascalVKooten Fixed
$endgroup$
– Broman
7 hours ago
$begingroup$
@PascalVKooten Fixed
$endgroup$
– Broman
7 hours ago
add a comment |
$begingroup$
Here's a very short code I would use to reverse any string.
a = 'Coding is fun'
rev = print("Reverse of this statement is "+ a[::-1])
New contributor
$endgroup$
4
$begingroup$
This adds nothing to Graipher's answer.
$endgroup$
– Broman
9 hours ago
add a comment |
$begingroup$
Here's a very short code I would use to reverse any string.
a = 'Coding is fun'
rev = print("Reverse of this statement is "+ a[::-1])
New contributor
$endgroup$
4
$begingroup$
This adds nothing to Graipher's answer.
$endgroup$
– Broman
9 hours ago
add a comment |
$begingroup$
Here's a very short code I would use to reverse any string.
a = 'Coding is fun'
rev = print("Reverse of this statement is "+ a[::-1])
New contributor
$endgroup$
Here's a very short code I would use to reverse any string.
a = 'Coding is fun'
rev = print("Reverse of this statement is "+ a[::-1])
New contributor
New contributor
answered 9 hours ago
DaachiDaachi
1
1
New contributor
New contributor
4
$begingroup$
This adds nothing to Graipher's answer.
$endgroup$
– Broman
9 hours ago
add a comment |
4
$begingroup$
This adds nothing to Graipher's answer.
$endgroup$
– Broman
9 hours ago
4
4
$begingroup$
This adds nothing to Graipher's answer.
$endgroup$
– Broman
9 hours ago
$begingroup$
This adds nothing to Graipher's answer.
$endgroup$
– Broman
9 hours ago
add a comment |
Midos is a new contributor. Be nice, and check out our Code of Conduct.
Midos is a new contributor. Be nice, and check out our Code of Conduct.
Midos is a new contributor. Be nice, and check out our Code of Conduct.
Midos is a new contributor. Be nice, and check out our Code of Conduct.
Thanks for contributing an answer to Code Review Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
Use MathJax to format equations. MathJax reference.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f215179%2fstring-reversal-in-python%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
11
$begingroup$
I'd say your code is
O(n**2)
because it creates at least n strings with length between1
andn
.$endgroup$
– Eric Duminil
yesterday
1
$begingroup$
I think you're right, since I didn't take into account the immutability of strings in python, but the graph of @Graipher shows something similar to O(nlgn) complexity for my code, I don't know I'm a bit confused...
$endgroup$
– Midos
yesterday
1
$begingroup$
Depending on how large the string is and what you want to do with it, it might be a good idea to create a lazy
ReverseString
class, which only accesses the data when needed.$endgroup$
– Eric Duminil
yesterday
2
$begingroup$
@Midos That might be due to the Python implementation used. AFAIK, cPython does sometimes manage to reuse one of the two strings (i.e. it tries to reserve enough space after/before one of the two strings and needs to copy only one string, but don't quote me on that). However, that is both implementation dependent and potentially dependent on the available memory, which is why Python's official style-guide recommends not relying on it.
$endgroup$
– Graipher
yesterday
$begingroup$
@Midos I recommend having a look at my answer now. It shows how you can speed things up with C functions that you call from Python.
$endgroup$
– Broman
7 hours ago