Mariel Frank and Sonny Li, authors of Codecademy’s Learn C++ course, recently got a chance to interview with Dr. Bjarne Stroustrup, the creator of C++. As part of the interview, he answered the highest voted C++ questions on Stack Overflow. While the whole interview is worth a read, Codecademy has generously allowed us to republish the Q&A portion of the interview. If you ever wondered if an answer on Stack Overflow was definitive, here’s about the closest you’ll get to certain (though we expect somebody to disagree).
🥇 Why is processing a sorted array faster than processing an unsorted array?
That sounds like an interview question. Is it true? How would you know? It is a bad idea to answer questions about efficiency without first doing some measurements, so it is important to know how to measure. So, I tried with a vector of a million integers and got:
Already sorted 32995 milliseconds
Shuffled 125944 milliseconds
Already sorted 18610 milliseconds
Shuffled 133304 milliseconds
Already sorted 17942 milliseconds
Shuffled 107858 milliseconds
I ran that a few times to be sure. Yes, the phenomenon is real. My key code was:
void run(vector<int>& v, const string& label)
{
auto t0 = system_clock::now();
sort(v.begin(), v.end());
auto t1 = system_clock::now();
cout << label
<< duration_cast<microseconds>(t1 — t0).count()
<< " milliseconds\n";
}
void tst()
{
vector<int> v(1'000'000);
iota(v.begin(), v.end(), 0);
run(v, "already sorted ");
std::shuffle(v.begin(), v.end(), std::mt19937{ std::random_device{}() });
run(v, "shuffled ");
}
At least the phenomenon is real with this compiler, standard library, and optimizer settings. Different implementations can and do give different answers. In fact, someone did do a more systematic study (a quick web search will find it) and most implementations show that effect. One reason is branch prediction: the key operation in the sort algorithm is “if(v[i] < pivot]) …” or equivalent. For a sorted sequence that test is always true whereas, for a random sequence, the branch chosen varies randomly. Another reason is that when the vector is already sorted, we never need to move elements to their correct position. The effect of these little details is the factor of five or six that we saw. Quicksort (and sorting in general) is a complex study that has attracted some of the greatest minds of computer science. A good sort function is a result of both choosing a good algorithm and paying attention to hardware performance in its implementation. If you want to write efficient code, you need to know a bit about machine architecture.
------------------------------------------------------------------------------------------------------------------- Pardon the interruption. Just here to say, the Stack Overflow Podcast is back! Come check us out and hear an interview with our new CEO.
🥈 What is the --> operator in C++?
That’s an old trick question. There is no --> operator in C++. Consider:
if (p-->m == 0) f(p);
It certainly looks as if there is a —-> operator and by suitable declaring p and m, you can even get that to compile and run:
int p = 2;
int m = 0;
if (p-->m == 0) f(p);
That means: see if p-- is greater than m (it is), and then compare the result (true) to 0. Well true != 0, so the result is false and f() is not called. In other words:
if ((p--) > m == 0) f(p);
Please don’t waste too much time on such questions. They have been popular for befuddling novices since before C++ was invented.
🥉 The Definitive C++ Book Guide and List
Unfortunately, there is no definite C++ book list. There can’t be one. Not everyone needs the same information, not everyone has the same background, and C++ best practices are evolving. I did a search of the web and found a bewildering set of suggestions. Many were seriously outdated and some were bad from the start. A novice looking for a good book without guidance will be very confused! You do need a book because the techniques that make C++ effective are not easily picked up from a few blogs on specific topics—and of course blogs also suffer from mistakes, being dated, and poor explanations. Often, they also focus on advanced new stuff and ignore the essential fundamentals. I recommend my Programming: Principles and Practice Using C++ (2nd Edition) for people just beginning to learn to program, and A Tour of C++ (2nd Edition) for people who are already programmers and need to know about modern C++. People with a strong mathematical background can start with Peter Gottschling‘s Discovering Modern C++: An Intensive Course for Scientists, Engineers, and Programmers. Once you start using C++ for real, you need a set of guidelines to distinguish what can be done and what is good practice. For that, I recommend the C++ Core Guidelines on GitHub. For good brief explanations of individual language features and standard-library functions, I recommend www.cppreference.com.
#4. What are the differences between a pointer variable and a reference variable in C++?
Both are represented in memory as a machine address. The difference is in their use. To initialize a pointer, you give it the address of an object:
int x = 7;
int* p1 = &x;
int* p2 = new int{9};
To read and write through a pointer, we use the dereference operator (*):
*p1 = 9; // write through p1
int y = *p2; // read through p2
When we assign one pointer to another, they will both point to the same object:
p1 = p2; // now p1 p2 both point to the int with the value 9
*p2 = 99; // write 99 through p2
int z = *p1; // reading through p1, z becomes 99 (not 9)
Note that a pointer can point to different objects during its lifetime. That’s a major difference from references. A reference is bound to an object when it is created and cannot be made to refer to another. For references, dereferencing is implicit. You initialize a reference with an object and the reference takes its address.
int x = 7;
int& r1 = x;
int& r2 = *new int{9};
Operator new returns a pointer, so I had to dereference it before assigning using it to initialize the reference. To read and write through a reference, we just use the reference’s name (no explicit dereferencing):
r1 = 9; // write through r1
int y = r2; // read through r2
When we assign one reference to another, the value referred to will be copied:
r1 = r2; // now p1 and p2 both have the value 9
r1 = 99; // write 99 through r1
int z = r2; // reading through r2, z becomes 9 (not 99)
Both references and pointers are frequently used as function arguments:
void f(int* p)
{
if (p == nullptr) return;
// ...
}
void g(int& r)
{
// ...
}
int x = 7;
f(&x);
g(x);
A pointer can be the nullptr, so we have to consider whether it points to anything. A reference can be assumed to refer to something.
#5. How do I iterate over the words of a string?
Use a stringstream, but how do you define “a word”? Consider “Mary had a little lamb.” Is the last word “lamb” or “lamb.”? If there is no punctuation, it is easy:
vector<string> split(const string& s)
{
stringstream ss(s);
vector<string> words;
for (string w; ss>>w; ) words.push_back(w);
return words;
}
auto words = split("here is a simple example"); // five words
for (auto& w : words) cout << w << '\n';
or just:
for (auto& w : split("here is a simple example")) cout << w << '\n';
By default, the >> operator skips whitespace. If we want arbitrary sets of delimiters, things get a bit more messy:
template<typename Delim>
string get_word(istream& ss, Delim d)
{
string word;
for (char ch; ss.get(ch); ) // skip delimiters
if (!d(ch)) {
word.push_back(ch);
break;
}
for (char ch; ss.get(ch); ) // collect word
if (!d(ch))
word.push_back(ch);
else
break;
return word;
}
The d is an operation telling whether a character is a delimiter and I return "" (the empty string) to indicate that there wasn’t a word to return.
vector<string> split(const string& s, const string& delim)
{
stringstream ss(s);
auto del = [&](char ch) { for (auto x : delim) if (x == ch) return true; return false; };
vector<string> words;
for (string w; (w = get_word(ss, del))!= ""; ) words.push_back(w);
return words;
}
auto words = split("Now! Here is something different; or is it? ", "!.,;? ");
for (auto& w : words) cout << w << '\n';
If you have the C++20 Range library, you don’t have to write something like this yourself but can use a split_view.
Bjarne Stroustrup is a Technical Fellow and Managing Director at Morgan Stanley in New York City and a Visiting Professor at Columbia University. He's also the creator of C++. For more information on C++20: https://isocpp.org.
Done reading this awesome post? We have something fun for ya. The Stack Overflow podcast is back! Come check it out or listen below.