Chenyo's org-static-blog

24 Sep 2024

C++ feature introduction

This is a personal not for the CMU 15-445 C++ bootcamp along with some explanation from Claude.ai.

1. Namespace

  • Provides scopes to identifiers with ::.
  • Namespaces can be nested.

    #include <iostream>
    namespace ABC {
        void spam(int a) { std::cout << "Hello from ABC::spam" << std::endl; }
        // declare a nested namespace
        namespace DEF {
        void bar(float a) { std::cout << "Hello from ABC::DEF::bar" << std::endl; }
        void use_spam(int a) {
        ABC::spam(a);
        // no difference with ABC::spam(a) if DEF
        // does not have a spam function
        spam(a);
        }
    } // namespace DEF
    
        void use_DEF_bar(float a) {
        // if a namespace outside of DEF wants to use DEF::bar
        // it must use the full namespace path ABC::DEF::bar
        DEF::bar(a);
        }
    
    } // namespace ABC
    
  • Two namespaces can define functions with the same name and signatures.
  • Name resolution rules: first check in the current scope, then enclosing scopes, finally going outward until it reaches the global scope.
  • Can use using namespace B to use identifiers in B in the current scope without specifying B::, this is not a good practice.
  • Can also only bring certain members of a namespace into the current scope, e.g., using C::eggs.

2. Wrapper class

  • Used to manage a resource, e.g., memory, file sockets, network connections.

    class IntPtrManager {
    private:
      // manages an int* to access the dynamic memory
      int *ptr_;
    };
    
  • Use the RAII (Resource Acquisition is Initialization) idea: tie the lifetime of a resource to the lifetime of an object.
    • Goal: ensure resources are released even if an exception occurs.
    • Acquisition: resources are acquired in the constructor.

      class IntPtrManager {
      public:
        // the constructor initializes a resource
        IntPtrManager() {
          ptr_ = new int;  // allocate the memory
          *ptr_ = 0;  // set the default value
        }
      
        // the second constructor takes an initial value
        IntPtrManager(int val) {
          ptr_ = new int;
          *ptr_ = val;
        }
      };
      
    • Release: resources are released in the destructor.

      class IntPtrManager {
      public:
        ~IntPtrManager() {
          // ptr_ may be null after the move
          // don't delete a null pointer
          if (ptr_) {
            delete ptr_;
          }
        }
      };
      
  • A wrapper class should not be copyable to avoid double deletion of the same resource in two destructors.

    class IntPtrManager {
    public:
      // delete copy constructor
      IntPtrManager(const IntPtrManager &) = delete;
      // delete copy assignment operator
      IntPtrManager &operator=(const IntPtrManager &) = delete;
    };
    
  • A wrapper class is still moveable from different lvalues/owners.

    class IntPtrManager {
    public:
      // a move constructor
      // called by IntPtrManager b(std::move(a))
      IntPtrManager(IntPtrManager &&other) {
        // while other is a rvalue reference, other.ptr_ is a lvalue
        // therefore a copy happens here, not a move
        // in the constructor this.ptr_ has not pointed to anything
        // so no need to delete ptr_
        ptr = other.ptr_;
        other.ptr_ = nullptr;  // other.ptr_ becomes invalud
      }
    
      // move assignment operator
      // this function is used by c = std::move(b) operation
      // note that calling std::move() does not require implementing this operator
      IntPtrManager &operator=(IntPtrManager &&other) {
        // a self assignment should not delete its ptr_
        if (ptr_ == other.ptr_) {
          return *this;
        }
        if (ptr_) {
          delete ptr_; // release old resource to avoid leak
        }
        ptr_ = other.ptr_;
        other.ptr_ = nullptr;  // invalidate other.ptr_
        return *this;
      }
    };
    

3. Iterator

  • Iterators, e.g., pointers, are objects that point to an element inside a container.

    int *array = malloc(sizeof(int) * 10);
    int *iter = array;
    int zero_elem = *iter;
    // use ++ to iterate through the C style array
    iter++;
    // deference the operator to return the value at the iterator
    int first_elem = *iter;
    
  • Two main components of an iterator:
    • Dereference operator *: return the value of the element of the current iterator position.
    • Increment ++: increment the iterator’s position by 1
      • Postfix iter++: return the iterator before the increment (Iterator).
      • Prefix ++iter: return the result of the increment (Iterator&).
      • ++iter is more efficient.
  • Often used to access and modify elements in C++ STL containers.

3.1. A doubly linked list (DLL) iterator

  • Define a link node:

    struct Node {
      // member initializer list, e.g., next_(nullptr) equals to next_ = nullptr
      Node(int val) : next_(nullptr), prev_(nullptr), value_(val) {}
    
      Node *next_;
      Node *prev_;
      int value_;
    };
    
  • Define the iterator for the DLL:

    class DLLIterator {
    public:
      // takes in a node to mark the start of the iteration
      DLLIterator(Node *head) : curr_(head) {}
    
      // prefix increment operator (++iter)
      DLLIterator &operator++() {
        // must use -> to access the member of a pointer!
        // use . if accessing the object itself
        curr_ = curr_->next_;
        return *this;
      }
    
      // postfix increment operator (iter++)
      // the (int) is a dummy parameter to differentiate
      // the prefix and postfix increment
      DLLIterator operator++(int) {
        DLLIterator temp = *this;
        // this is a pointer to the current object
        // *this returns the iterator object
        // ++*this calls the prefix increment operator,
        // which equals to `this->operator++()`
        ++*this;
        return temp;
      }
    
      // implement the equality operator
      // an lvalue reference argument avoids the copy
      // the const in the parameter means this function
      // cannot modify the argument
      // the `const` outside the parameter list means
      // the function cannot modify `this`
      bool operator==(const DLLIterator &str) const {
        return itr.curr_ == this->curr_;
      }
    
      // implement the dereference operator to return the value
      // at the current iterator position
      int operator*() { return curr_->value_; }
    
    private:
      Node *curr_;
    };
    
  • Define DLL:

    class DLL {
    public:
      DLL() : head_(nullptr), size_(0) {}
    
      // the destructor deletes nodes one by one
      ~DLL() {
        Node *current = head_;
        while (current != nullptr) {
          Node *next = current->next_;
          delete current;
          current = next;
        }
        head_ = nullptr;
      }
    
      // after the insertion `new_node` becomes the new head
      // `head` is just a pointer to the node
      void InsertAtHead(int val) {
        Node *new_node = new Node(val);
        // new_node->next points to the object pointed by head_
        new_node->next_ = head_;
    
        if (head_ != nullptr) {
          head_->prev_ = new_node;
        }
    
        head_ = new_node;
        size_ += 1;
      }
    
      DLLIterator Begin() { return DLLIterator(head_); }
    
      // returns the pointer pointing one after the last element
      // used in the loop to determine whether the iteration ends
      // e.g., `for (DLLIterator iter = dll.Begin(); iter != dll.End(); ++iter)`
      DLLIterator End() { return DLLIterator(nullptr); }
    
      Node *head_{nullptr};  // in-class initializers
      size_t size_;
    };
    

4. STL containers

  • The C++ STL (standard library) is a generic collection of data structure and algorithm implementations, e.g., stacks, queues, hash tables.
  • Each container has own header, e.g., std::vector.
  • The std::set is implemented as a red-black tree.

4.1. Vector

#include <algorithm> // to use std::remove_if
#include <iostream>  // to use std::cout
#include <vector>
// A helper class used for vector
class Point {
public:
  // constructors
  Point() : x_(0), y_(0) {}
  Point(int x, int y) : x_(x), y_(y) {}
  // inline asks the compiler to substitute the function
  // directly at the calling location instead of performing
  // a normal function call, to improve performance for small functions
  inline int GetX() const { return x_; }
  inline void SetX(int x) { x_ = x; }

  void PrintPoint() const {
    std::cout << "Point: (" << x_ << ", " << "y_" << ")\n";
  }

private:
  int x_;
  int y_;
};

int main() {
  std::vector<Point> point_vector;
  // approach 1 to append to a vector
  point_vector.push_back(Point(35, 36));
  // approach 2, pass the argument to Point(x,y) constructor
  // slightly faster than push_back
  point_vector.emplace_back(37, 38);
  // iterate through index
  // size_t: unsigned integers specifially used in loop or counting
  for (size_t i = 0; i < point_vector.size(); ++i) {
    point_vector[i].PrintPoint();
  }
  // iterate through mutable reference
  for (Point &item : point_vector) {
    item.SetX(10);
  }
  // iterate through immutable reference
  for (const Point &item : point_vector) {
    item.GetX();
  }

  // initialize the vector with an initializer list
  std::vector<int> int_vector = {0, 1, 2, 3};
  // erase element given its index
  // int_vector.begin() returns a std::vector<int>::iterator
  // pointing to the first elemnt in the vector
  // the vector iterator has a plus iterator
  int_vector.erase(int_vector.begin() + 2); // {0, 1, 3}
  // erase a range of elements
  // int_vector.end() points to the end of a vector (not the last element)
  // and cannot be accessed.
  int_vector.erase(int_vector.begin() + 1, int_vector.end()); // {0}
  // erase elements via filtering
  // std::remove_if(range_begin, range_end, condition) returns an iterator
  // pointing to the first element to be erased
  // remove_if() also partitions point_vector so that unsatisfied elements are
  // moved before the point_vector.begin(), i.e., the vector is reordered
  point_vector.erase(
      std::remove_if(point_vector.begin(), point_vector.end(),
                     [](const Point &point) { return point.GetX() == 10; }),
      point_vector.end());

  return 0;
}

4.2. Set

#include <iostream>
#include <set>

int main() {
  std::set<int> int_set;
  // can insert element with .insert() or .emplace()
  // .emplace() allows to construct the object in place
  for (int i = 1; i <= 5; ++i) {
    int_set.insert(i);
  }
  for (int i = 6; i <= 10; ++i) {
    int_set.emplace(i);
  }
  // iterate the set
  for (std::set<int>::iterator it = int_set.begin(); it != int_set.end();
       ++it) {
    std::cout << *it << " ";
  }
  std::cout << "\n";
  // .find(key) returns an iterator pointing to the key
  // if it is in the set, otherwise returns .end()
  std::set<int>::iterator search = int_set.find(2);
  if (search != int_set.end()) {
    std::cout << "2 is not found\n";
  }
  // check whether the set contains a key with .count()
  // it returns either 0 or 1 as each key is unique
  if (int_set.count(11) == 0) {
    std::cout << "11 is not in the set.\n";
  }
  // erase a key, returns the count of removed elements 0 or 1
  int_set.erase(4);
  // erase a key given its position, returns the iterator to the next element
  int_set.erase(int_set.begin());
  // erase a range of elements
  int_set.erase(int_set.find(9), int_set.end());
}

4.3. Unordered maps

#include <iostream>
#include <string>
#include <unordered_map>
#include <utility> // to use std::make_pair

int main() {
  std::unordered_map<std::string, int> map;
  // insert items
  map.insert({"foo", 2});
  map.insert({{"bar", 1}, {"eggs", 2}});
  // insert items via pairs
  map.insert(std::make_pair("hello", 10));
  // insert items in an array-style
  map["world"] = 3;
  // update the value
  map["foo"] = 9;
  // .find() returns an iterator pointing to the item
  // or the end
  std::unordered_map<std::string, int>::iterator result = map.find("bar");
  if (result != map.end()) {
    // one way to access the item
    // both '\n' and std::endl prints newliine, but std::endl
    // also flushes the output buffer, so use '\n' is better
    std::cout << "key: " << result->first << " value: " << result->second
              << std::endl;
    // another way is dereferencing
    std::pair<std::string, int> pair = *result;
    std::cout << "key: " << pair.first << " value: " << pair.second
              << std::endl;
    // check whether a key exists with .count()
    if (map.count("foo") == 0) {
      std::cout << "foo does not exist\n";
    }
    // erase an item via a key
    map.erase("world");
    // or via an iterator
    map.erase(map.find("bar"));
    // can iterate the map via iterator or via for-each
    for (std::unordered_map<std::string, int>::iterator it = map.begin();
         it != map.end(); ++it) {
      std::cout << "(" << it->first << ", " << it->second << "), ";
    }
    std::cout << "\n";
  }
}

5. auto

  • auto keyword tells the compiler to infer the type via its initialization expression.
#include <vector>
#include<unordered_map>
#include<string>
#include<iostream>
// create very long class and function
template <typename T, typename U> class VeryLongTemplateClass {
public:
  VeryLongTemplateClass(T instance1, U instance2)
      : instance1_(instance1), instance2_(instance2) {}

private:
  T instance1_;
  U instance2_;
};

template <typename T> VeryLongTemplateClass<T, T> construct_obj(T instance) {
  return VeryLongTemplateClass<T, T>(instance, instance);
}

int main() {
  auto a = 1;                        // a is int
  auto obj1 = construct_obj<int>(2); // can infer
  // auto defaults to copy objects rather than taking the reference
  std::vector<int> int_values = {1, 2, 3, 4};
  // a deep-copy happens
  auto copy_int_values = int_values;
  // this creates a reference
  auto &ref_int_values = int_values;
  // use auto in the for loop is very common
  std::unordered_map<std::string, int> map;
  for (auto it = map.begin(); it != map.end(); ++it) {
  }
  // another exmaple
  std::vector<int> vec = {1, 2, 3, 4};
  for (const auto &elem : vec) {
    std::cout << elem << " ";
  }
  std::cout << std::endl;
}

6. Smart pointers

  • A smart pointer is a data structure used in languages that do not have built-in memory management (e.g., with garbage collection, e.g., Python, Java) to handle memory allocation and deallocation.
  • std::unique_ptr and std::shared_ptr are two C++ standard library smart pointers, they are wrapper classes over raw pointers.
  • std::unique_ptr retains sole ownership of an object, i.e., no two instances of unique_ptr can manage the same object, a unique_ptr cannot be copied.
  • std::shared_ptr retains shared ownership of an object, i.e., multiple shared pointers can own the same object and can be copied.

6.1. std::unique_ptr

#include <iostream>
#include <memory>  // to use std::unique_ptr
#include <utility> // to use std::move

// class Point is the same as before

// takes in a unique point reference and changes its x value
void SetXTo10(std::unique_ptr<Point> &ptr) { ptr->SetX(10); }
int main() {
  // initialize an empty unique pointer
  std::unique_ptr<Point> u1;
  // initialize a unique pointer with constructors
  std::unique_ptr<Point> u2 = std::make_unique<Point>();
  std::unique_ptr<Point> u3 = std::make_unique<Point>(2, 3);
  // can treat a unique pointer as a boolean to determine whether
  // the pointer contains data
  std::cout << "Pointer u1 is " << (u1 ? "not empty" : "empty") << std::endl;
  if (u2) {
    std::cout << u2->GetX() << std::endl;
  }
  // unique_ptr has no copy constructor!
  std::unique_ptr<Point> u4 = u3; // won't compile!
  // can transfer ownership with std::move
  std::unique_ptr<Point> u5 = std::move(u3); // then u3 becomes empty!
  // pass the pointer as a reference so the ownership does not change
  SetXTo10(u5); // can still access u5 afterwards
  return 0;
}
  • Note that the compiler does not prevent 2 unique pointers from pointing to the same object.
// the following code can compile
// but it causes a double deletion!
MyClass *obj = new MyClass();
std::unique_ptr<MyClass> ptr1(obj);
std::unique_ptr<MyClass> ptr2(obj);

6.2. std::shared_ptr

#include <iostream>
#include <memory>
#include <utility>

// class Point is the same as before

// modify a Point inside a shared_ptr by passing the reference
void modify_ptr_via_ref(std::shared_ptr<Point> &point) { point->SetX(10); }
// modify a Point inside a shared_ptr by passing the rvalue reference
// If an object is passed by rvalue reference, one should assume the ownership
// is moved after the function call, so the original object cannot be used
void modify_ptr_via_rvalue_ref(std::shared_ptr<Point> &&point) {
  point->SetY(11);
}
// copy a shared_pointer with the default constructor
// so one should define own copy constructor and assignment when an object
// contains pointers
void copy_shard_ptr_in_function(std::shared_ptr<Point> point) {
  std::cout << "Use count of the shared pointer: " << point.use_count()
            << std::endl; // add by 1
  // the copy is destroyed at the end, so the count is decremented
}

int main() {
  // the pointer constructors are the same as unique_ptr
  std::shared_ptr<Point> s1;
  std::shared_ptr<Point> s2 = std::make_shared<Point>();
  std::shared_ptr<Point> s3 = std::make_shared<Point>(2, 3);
  // copy a shared pointer via copy assignment or copy constructor
  // increment the reference count, which can be tracked by .use_count
  std::cout << "Use count of s3" << s3.use_count() << std::endl; // 1
  std::shared_ptr<Point> s4 = s3; // copy assignment
  std::cout << "Use count of s3" << s3.use_count() << std::endl; // 2
  std::shared_ptr<Point> s5(s4);
  std::cout << "Use count of s3" << s3.use_count() << std::endl; // 3
  s3->SetX(100); // also changes the data in s4 and s5
  std::shared_ptr<Point> s6 = std::move(s5); // s5 transfer the ownership to s6
  std::cout << "Use count of s3" << s3.use_count()
            << std::endl; // still 3: s3, s4, s6
  // as unique_ptr, shared_ptr can be passed by references and rvalue references
  modify_ptr_via_ref(s2);                   // setX(11)
  modify_ptr_via_rvalue_ref(std::move(s2)); // setY(12)
  std::cout << "s2.x = " << s2->GetX()
            << " , s2.y = " << s2->GetY(); // (11, 12)
  // shared_ptr can also be passed by value/copy
  std::cout << "Use count of s2" << s2.use_count() << std::endl; // 1
  copy_shared_ptr_in_function(s2); // inside the function, the use count is 2
  std::cout << "Use count of s2" << s2.use_count()
            << std::endl; // 1 again as the copy is detroyed
  return 0;
}

7. Synchronization

7.1. std::mutex

#include <iostream>
#include <mutex>
#include <thread>

// define a global variable to be modified
// by multiple threads
int count = 0;
// declare a mutex
std::mutex m;
void add_count() {
  m.lock(); // acquire the lock
  count += 1;
  m.unlock(); // release the lock
}
int main() {
  std::thread t1(add_count);
  std::thread t2(add_count);
  t1.join();
  t2.join();
  std::cout << count << std::endl; // always 2
  return 0;
}

7.2. std::scoped_lock

  • A mutex wrapper that provides a RAII-style of obtaining and releasing the lock.
  • When the object is constructed, the locks are acquired; when the object is destructured, the locks are released.
  • Better than std::mutex since it provides exception safety.
#include <iostream>
#include <mutex>

int count = 0;
std::mutex m;
void add_count() {
  // the scoped_lock constructor allows for the thread
  // to acquire the mutex m
  std::scoped_lock slk(m);
  count += 1;
  // once the function finishes, slk is destructurd and
  // m is released
}

7.3. std::condition_variable

  • Allow threads to wait until a particular condition before acquiring a mutex.
  • Allow other threads to signal waiting threads to alert them that the condition may be true. notify_one
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>

int count = 0;
std::mutex m;
// declare a condition variable
std::condition_variable cv;
void add_count_and_notify() {
  std::scoped_lock slk(m);
  count += 1;
  if (count == 2) {
    // notidy one waiting thread
    // otherwise the waiter thread hangs forever
    cv.notify_one();
  }
}
void waiter_thread() {
  // a unique_lock is movable but not copiable
  // more flexible than scoped_lock as it can unlock
  // and relock manually, while scoped_lock can only be
  // unlocked automatically.
  // scoped_lock cannot be used with condition variables
  std::unique_lock lk(m);
  cv.wait(lk, [] { return count == 2; });
  std::cout << count << std::endl;
}
int main() {
  std::thread t1(waiter_thread);
  // make t1 really waits
  std::this_thread::sleep_for(std::chrono::milliseconds(100));
  std::thread t2(add_count_and_notify);
  std::thread t3(add_count_and_notify);
  t1.join();
  t2.join();
  t3.join();
  return 0;
}

7.4. Reader-writer lock

  • A reader-writer lock allows multiple threads to have shared read access (no writers are allowed during read operations) and exclusive write access, i.e., no other readers or writers are allowed during the write.
  • C++ does not have a specific reader-writer’s lock library, but can be emulated with std::shared_mutex, std::shared_lock and std::unique_lock.
  • std::shared_mutex is a mutex that allows for both shared read-only locking and exclusive write-only locking.
  • std::shared_lock can be used as an RAII-style read lock.
  • std::unique_lock can be used a RAII-style exclusive write lock.

    #include <iostream>
    #include <mutex>
    #include <shared_mutex>
    #include <thread>
    
    int count = 0; // resource
    std::shared_mutex m;
    void read_value() {
      // use shared_lock to gain read-only, shared access
      std::shared_lock lk(m);
      std::cout << "Reading " + std::to_string(count) << std::endl;
    }
    
    void write_value() {
      // use unique_lock to gain exclusive access
      std::unique_lock lk(m);
      count += 3;
    }
    int main() {
      // since all 3 threads run in parallel,
      // the output is not deterministic
      std::thread t1(read_value);
      std::thread t2(write_value);
      std::thread t3(read_value);
      t1.join();
      t2.join();
      t3.join();
      return 0;
    }
    
Tags: c++ study cmu