refactor: Add functionality for tagged DFA. #62

Closed
SharafMohamed wants to merge 476 commits

Conversation

SharafMohamed (Contributor) commented Dec 5, 2024

References

  • Depends on PR#71.
  • To review in parallel with PR#71, diff against PR#71 locally. In the repo run:
git fetch upstream pull/71/head:pr-71
git fetch upstream pull/62/head:pr-62
git diff pr-71 pr-62

Description

Validation performed

Previously existing tests succeed.

Summary by CodeRabbit

Release Notes

Overview

This release introduces significant refactoring and type improvements across the log_surgeon library, focusing on enhancing code clarity and consistency.

Changes

  • Type Renaming

    • Renamed finite automata state types from RegexNFAByteState to ByteNfaState
    • Updated DFA and NFA type names for improved readability
    • Standardized template parameter naming conventions
  • Code Modernization

    • Introduced auto return types with explicit trailing return type syntax
    • Simplified method signatures
    • Removed unnecessary code complexity
  • New Features

    • Added new Register and RegisterOperator classes in finite automata module
    • Enhanced Dfa class with improved state management
  • Performance

    • Improved memory management using modern C++ practices
    • Streamlined type handling across multiple components

The changes primarily focus on internal code structure and do not introduce significant user-facing modifications.

SharafMohamed and others added 30 commits November 13, 2024 16:12

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

♻️ Duplicate comments (1)
src/log_surgeon/finite_automata/DfaState.hpp (1)

60-64: ⚠️ Potential issue

Add bounds checking to prevent out-of-bounds access in next method

In the next method, when state_type is StateType::Byte, accessing m_bytes_transition[character] without verifying that character is within bounds can lead to out-of-bounds access if character is greater than or equal to cSizeOfByte. Adding a bounds check ensures safe access.

Apply this diff to add bounds checking:

 if constexpr (StateType::Byte == state_type) {
+    if (character >= cSizeOfByte) {
+        return nullptr;
+    }
     return m_bytes_transition[character];
 } else {
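A minimal sketch of the guarded lookup, under stated assumptions: `ByteState` is a hypothetical stand-in for `DfaState`, and `character` is taken to be `uint32_t` because the parameter type is not shown in this thread. If the real parameter is already `uint8_t`, the comparison can never be true and the guard is redundant.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr uint32_t cSizeOfByte = 256;

// Hypothetical stand-in for DfaState; only the byte-transition table is modelled.
class ByteState {
public:
    auto add_byte_transition(uint8_t byte, ByteState* dest_state) -> void {
        m_bytes_transition[byte] = dest_state;
    }

    // Returns nullptr when there is no transition, including for any value
    // that falls outside the 256-entry table.
    [[nodiscard]] auto next(uint32_t character) const -> ByteState* {
        if (character >= cSizeOfByte) {
            return nullptr;
        }
        return m_bytes_transition[character];
    }

private:
    // Value-initialised, so every entry starts out as nullptr.
    std::array<ByteState*, cSizeOfByte> m_bytes_transition{};
};
```

Returning `nullptr` for out-of-range input reuses the existing "no transition" signal, so callers that already handle a missing transition need no new error path.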
🧹 Nitpick comments (2)
src/log_surgeon/finite_automata/Register.hpp (1)

16-58: Consider adding move semantics and memory management

The class design could benefit from several improvements:

  1. Define copy/move constructors and assignment operators
  2. Add destructor or use smart pointers for proper cleanup
  3. Consider reserving space in m_positions vector if typical size is known

Example implementation:

class Register {
public:
    // ... existing code ...

    // Add move semantics
    Register(Register&&) noexcept = default;
    Register& operator=(Register&&) noexcept = default;

    // Prevent copying if tag ownership is exclusive
    Register(Register const&) = delete;
    Register& operator=(Register const&) = delete;

    // Optional: Reserve space if typical size is known
    void reserve_positions(size_t size) {
        m_positions.reserve(size);
    }
};
src/log_surgeon/finite_automata/DfaState.hpp (1)

41-41: Consider passing uint8_t by value

In add_byte_transition, passing uint8_t const& byte by const reference may be less efficient than passing by value, since uint8_t is a small type. Consider passing byte by value.

Apply this diff:

-auto add_byte_transition(uint8_t const& byte, DfaState* dest_state) -> void {
+auto add_byte_transition(uint8_t byte, DfaState* dest_state) -> void {
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 538ae8b and 139b097.

📒 Files selected for processing (10)
  • examples/intersect-test.cpp (4 hunks)
  • src/log_surgeon/Lalr1Parser.hpp (10 hunks)
  • src/log_surgeon/Lexer.tpp (16 hunks)
  • src/log_surgeon/finite_automata/Dfa.hpp (1 hunks)
  • src/log_surgeon/finite_automata/DfaState.hpp (4 hunks)
  • src/log_surgeon/finite_automata/DfaStatePair.hpp (5 hunks)
  • src/log_surgeon/finite_automata/NfaState.hpp (1 hunks)
  • src/log_surgeon/finite_automata/Register.hpp (1 hunks)
  • src/log_surgeon/finite_automata/StateType.hpp (1 hunks)
  • src/log_surgeon/finite_automata/TaggedTransition.hpp (5 hunks)
✅ Files skipped from review due to trivial changes (1)
  • src/log_surgeon/finite_automata/StateType.hpp
🚧 Files skipped from review as they are similar to previous changes (6)
  • src/log_surgeon/finite_automata/Dfa.hpp
  • src/log_surgeon/Lalr1Parser.hpp
  • src/log_surgeon/finite_automata/DfaStatePair.hpp
  • examples/intersect-test.cpp
  • src/log_surgeon/finite_automata/TaggedTransition.hpp
  • src/log_surgeon/Lexer.tpp
🧰 Additional context used
📓 Path-based instructions (3)
src/log_surgeon/finite_automata/DfaState.hpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

src/log_surgeon/finite_automata/Register.hpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

src/log_surgeon/finite_automata/NfaState.hpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

📓 Learnings (3)
src/log_surgeon/finite_automata/DfaState.hpp (2)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#47
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:127-128
Timestamp: 2024-11-10T16:46:58.543Z
Learning: `RegexNFAUTF8State` is defined as a type alias for `RegexNFAState<RegexNFAStateType::UTF8>`.
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:0-0
Timestamp: 2024-11-13T20:02:13.737Z
Learning: In `src/log_surgeon/finite_automata/RegexNFAState.hpp`, the constructor `RegexNFAState(std::set<Tag const*> tags, RegexNFAState const* dest_state)` has been updated to use `std::vector<Tag const*> tags` instead of `std::set`.
src/log_surgeon/finite_automata/Register.hpp (1)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#56
File: src/log_surgeon/finite_automata/RegisterHandler.hpp:0-0
Timestamp: 2024-11-27T22:25:35.608Z
Learning: In the `RegisterHandler` class in `src/log_surgeon/finite_automata/RegisterHandler.hpp`, the methods `add_register` and `append_position` rely on `emplace_back` and `m_prefix_tree.insert` to handle exceptions correctly and maintain consistent state without requiring additional exception handling.
src/log_surgeon/finite_automata/NfaState.hpp (1)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:0-0
Timestamp: 2024-11-13T20:02:13.737Z
Learning: In `src/log_surgeon/finite_automata/RegexNFAState.hpp`, the constructor `RegexNFAState(std::set<Tag const*> tags, RegexNFAState const* dest_state)` has been updated to use `std::vector<Tag const*> tags` instead of `std::set`.
🔇 Additional comments (3)
src/log_surgeon/finite_automata/Register.hpp (1)

44-57: ⚠️ Potential issue

Address unused member and add null pointer validation

The RegisterOperator class has several issues:

  1. m_operation member is never used
  2. No null pointer validation for m_register
  3. Raw pointer ownership semantics aren't clear

Apply this diff to add null pointer validation:

     [[nodiscard]] auto get_register() const -> Register* { return m_register; }

-    [[nodiscard]] auto is_start() const -> bool { return m_register->is_start(); }
+    [[nodiscard]] auto is_start() const -> bool {
+        if (nullptr == m_register) {
+            throw std::runtime_error("Register pointer is null");
+        }
+        return m_register->is_start();
+    }
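One alternative to checking on every call, sketched here as an assumption rather than as the PR's design, is to validate the pointer once in the constructor so that every later access can rely on it. The `Register` below is a simplified stand-in, `get_operation()` is a hypothetical accessor that would also give `m_operation` a use, and the sketch assumes `RegisterOperator` never legitimately holds a null register.

```cpp
#include <cassert>
#include <stdexcept>

enum class RegisterOperation { Assign, Append, Clear, None, Pop };

// Simplified stand-in for the PR's Register; only is_start() is modelled.
class Register {
public:
    explicit Register(bool is_start) : m_is_start{is_start} {}

    [[nodiscard]] auto is_start() const -> bool { return m_is_start; }

private:
    bool m_is_start;
};

class RegisterOperator {
public:
    // Assumption: the operator observes the register and does not own it.
    RegisterOperator(Register* reg, RegisterOperation operation)
            : m_register{reg},
              m_operation{operation} {
        if (nullptr == m_register) {
            throw std::invalid_argument("Register pointer is null");
        }
    }

    [[nodiscard]] auto get_register() const -> Register* { return m_register; }

    // Hypothetical accessor, not part of the reviewed code.
    [[nodiscard]] auto get_operation() const -> RegisterOperation { return m_operation; }

    // Safe without a check: the constructor rejected null pointers.
    [[nodiscard]] auto is_start() const -> bool { return m_register->is_start(); }

private:
    Register* m_register;
    RegisterOperation m_operation;
};
```

If a null register is a valid state in the real code, this approach does not apply and the per-call check from the diff is the safer choice.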

The m_operation member appears unused. Let's verify its usage across the codebase:

src/log_surgeon/finite_automata/DfaState.hpp (1)

27-27: Initialisation of m_bytes_transition array prevents undefined behaviour

Good job on initialising the m_bytes_transition array in the constructor. This ensures all pointers are set to nullptr, preventing undefined behaviour due to uninitialised pointers.

src/log_surgeon/finite_automata/NfaState.hpp (1)

272-273: Reminder to address the TODO comment for handling negative tagged transitions

The TODO comment indicates that handling of multiple negative tags in the epsilon_closure method is incomplete. This could affect the correctness of the NFA implementation. Please address this TODO to ensure proper handling of negative tagged transitions.
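For context, an epsilon closure is the set of states reachable from a starting state through epsilon transitions alone. The sketch below shows only the untagged core of that traversal, using an explicit stack as the reviewed code does; `SimpleNfaState` is a hypothetical type, and tagged transitions (positive or negative) and register operations are deliberately left out.

```cpp
#include <cassert>
#include <set>
#include <stack>
#include <vector>

// Hypothetical minimal NFA state: only untagged epsilon transitions are modelled.
struct SimpleNfaState {
    std::vector<SimpleNfaState const*> epsilon_transitions;
};

// Collects every state reachable from `start` through epsilon transitions alone.
[[nodiscard]] inline auto epsilon_closure(SimpleNfaState const* start)
        -> std::set<SimpleNfaState const*> {
    assert(nullptr != start);
    std::set<SimpleNfaState const*> closure;
    std::stack<SimpleNfaState const*> unvisited_stack;
    unvisited_stack.push(start);
    while (false == unvisited_stack.empty()) {
        auto const* current = unvisited_stack.top();
        unvisited_stack.pop();
        // insert() reports whether the state was new; skipping states that
        // were already visited is what makes cycles terminate.
        if (false == closure.insert(current).second) {
            continue;
        }
        for (auto const* dest : current->epsilon_transitions) {
            unvisited_stack.push(dest);
        }
    }
    return closure;
}
```

A tagged variant would have to carry register state alongside each stack entry, which is where the handling of negative tags mentioned in the TODO would come in.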

Would you like me to help implement the necessary changes or open a GitHub issue to track this task?

Comment on lines +1 to +15
#ifndef LOG_SURGEON_FINITE_AUTOMATA_REGISTER
#define LOG_SURGEON_FINITE_AUTOMATA_REGISTER

#include <cstdint>

#include <log_surgeon/finite_automata/Tag.hpp>
enum class RegisterOperation {
    Assign,
    Append,
    Clear,
    None,
    Pop,
};

namespace log_surgeon::finite_automata {

⚠️ Potential issue

Add missing include and move enum into namespace

Two issues need attention:

  1. The <vector> header is required for std::vector usage
  2. The RegisterOperation enum should be within the log_surgeon::finite_automata namespace for consistency

Apply this diff to fix the issues:

 #ifndef LOG_SURGEON_FINITE_AUTOMATA_REGISTER
 #define LOG_SURGEON_FINITE_AUTOMATA_REGISTER
 
 #include <cstdint>
+#include <vector>
 
 #include <log_surgeon/finite_automata/Tag.hpp>
+
+namespace log_surgeon::finite_automata {
 enum class RegisterOperation {
     Assign,
     Append,
     Clear,
     None,
     Pop,
 };
 
-namespace log_surgeon::finite_automata {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-#ifndef LOG_SURGEON_FINITE_AUTOMATA_REGISTER
-#define LOG_SURGEON_FINITE_AUTOMATA_REGISTER
-#include <cstdint>
-#include <log_surgeon/finite_automata/Tag.hpp>
-enum class RegisterOperation {
-    Assign,
-    Append,
-    Clear,
-    None,
-    Pop,
-};
-namespace log_surgeon::finite_automata {
+#ifndef LOG_SURGEON_FINITE_AUTOMATA_REGISTER
+#define LOG_SURGEON_FINITE_AUTOMATA_REGISTER
+#include <cstdint>
+#include <vector>
+#include <log_surgeon/finite_automata/Tag.hpp>
+namespace log_surgeon::finite_automata {
+enum class RegisterOperation {
+    Assign,
+    Append,
+    Clear,
+    None,
+    Pop,
+};

Comment on lines +16 to +42
class Register {
public:
    explicit Register(Tag* tag, bool const is_start) : m_tag{tag}, m_is_start(is_start) {}

    auto add_pos(uint32_t const pos) -> void { m_positions.push_back(pos); }

    auto update_last_position(uint32_t const pos) -> void { m_positions.back() = pos; }

    auto negate_last_position() -> void { m_positions.pop_back(); }

    auto negate_all_positions() -> void { m_positions.clear(); }

    [[nodiscard]] auto get_tag() const -> Tag* { return m_tag; }

    [[nodiscard]] auto is_start() const -> bool { return m_is_start; }

    [[nodiscard]] auto get_last_position() const -> uint32_t { return m_positions.back(); }

    [[nodiscard]] auto get_all_positions() const -> std::vector<uint32_t> const& {
        return m_positions;
    }

private:
    Tag* m_tag;
    bool m_is_start;
    std::vector<uint32_t> m_positions;
};

🛠️ Refactor suggestion

⚠️ Potential issue

Add bounds checking and clarify ownership semantics

Several concerns in the Register class implementation:

  1. Methods accessing m_positions don't check for empty vector
  2. Raw pointer ownership semantics aren't clear
  3. Public methods lack documentation

Apply this diff to add bounds checking:

     auto update_last_position(uint32_t const pos) -> void {
+        if (m_positions.empty()) {
+            throw std::out_of_range("Cannot update position in empty register");
+        }
         m_positions.back() = pos;
     }

     auto negate_last_position() -> void {
+        if (m_positions.empty()) {
+            throw std::out_of_range("Cannot negate position in empty register");
+        }
         m_positions.pop_back();
     }

     [[nodiscard]] auto get_last_position() const -> uint32_t {
+        if (m_positions.empty()) {
+            throw std::out_of_range("Cannot get position from empty register");
+        }
         return m_positions.back();
     }

Consider these improvements:

  1. Use std::unique_ptr or document ownership semantics for m_tag
  2. Add method documentation explaining preconditions and exceptions
  3. Consider adding a method to check if positions exist before access
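The third suggestion can be sketched alongside the checked accessors. This is an illustration under assumptions, not the PR's code: `PositionRegister` is a hypothetical, trimmed-down stand-in for `Register` (the `Tag` pointer is omitted), and `has_positions()` is an invented name for the proposed query method.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical, simplified stand-in for the PR's Register; the Tag pointer is omitted.
class PositionRegister {
public:
    auto add_pos(uint32_t pos) -> void { m_positions.push_back(pos); }

    // Lets callers test the precondition instead of relying on exceptions.
    [[nodiscard]] auto has_positions() const -> bool { return false == m_positions.empty(); }

    // Throws std::out_of_range rather than calling back() on an empty vector,
    // which would be undefined behaviour.
    [[nodiscard]] auto get_last_position() const -> uint32_t {
        if (m_positions.empty()) {
            throw std::out_of_range("Cannot get position from empty register");
        }
        return m_positions.back();
    }

    // Throws std::out_of_range rather than calling pop_back() on an empty vector.
    auto negate_last_position() -> void {
        if (m_positions.empty()) {
            throw std::out_of_range("Cannot negate position in empty register");
        }
        m_positions.pop_back();
    }

private:
    std::vector<uint32_t> m_positions;
};
```

Callers on a hot path can branch on `has_positions()` first, while the throwing accessors remain a backstop against misuse.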
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-class Register {
-public:
-    explicit Register(Tag* tag, bool const is_start) : m_tag{tag}, m_is_start(is_start) {}
-    auto add_pos(uint32_t const pos) -> void { m_positions.push_back(pos); }
-    auto update_last_position(uint32_t const pos) -> void { m_positions.back() = pos; }
-    auto negate_last_position() -> void { m_positions.pop_back(); }
-    auto negate_all_positions() -> void { m_positions.clear(); }
-    [[nodiscard]] auto get_tag() const -> Tag* { return m_tag; }
-    [[nodiscard]] auto is_start() const -> bool { return m_is_start; }
-    [[nodiscard]] auto get_last_position() const -> uint32_t { return m_positions.back(); }
-    [[nodiscard]] auto get_all_positions() const -> std::vector<uint32_t> const& {
-        return m_positions;
-    }
-private:
-    Tag* m_tag;
-    bool m_is_start;
-    std::vector<uint32_t> m_positions;
-};
+class Register {
+public:
+    explicit Register(Tag* tag, bool const is_start) : m_tag{tag}, m_is_start(is_start) {}
+    auto add_pos(uint32_t const pos) -> void { m_positions.push_back(pos); }
+    auto update_last_position(uint32_t const pos) -> void {
+        if (m_positions.empty()) {
+            throw std::out_of_range("Cannot update position in empty register");
+        }
+        m_positions.back() = pos;
+    }
+    auto negate_last_position() -> void {
+        if (m_positions.empty()) {
+            throw std::out_of_range("Cannot negate position in empty register");
+        }
+        m_positions.pop_back();
+    }
+    auto negate_all_positions() -> void { m_positions.clear(); }
+    [[nodiscard]] auto get_tag() const -> Tag* { return m_tag; }
+    [[nodiscard]] auto is_start() const -> bool { return m_is_start; }
+    [[nodiscard]] auto get_last_position() const -> uint32_t {
+        if (m_positions.empty()) {
+            throw std::out_of_range("Cannot get position from empty register");
+        }
+        return m_positions.back();
+    }
+    [[nodiscard]] auto get_all_positions() const -> std::vector<uint32_t> const& {
+        return m_positions;
+    }
+private:
+    Tag* m_tag;
+    bool m_is_start;
+    std::vector<uint32_t> m_positions;
+};

Comment on lines +236 to +241
if (current_register->get_tag() == transition_tag) {
    unvisited_stack.emplace(
        RegisterOperator{current_register, RegisterOperation::Append},
        dest_state
    );
} else {

⚠️ Potential issue

Possible null pointer dereference of current_register

In the condition if (current_register->get_tag() == transition_tag), current_register could be nullptr, which may lead to a null pointer dereference. Please ensure current_register is not nullptr before dereferencing.

Apply this diff to check for nullptr before dereferencing:

-            if (current_register->get_tag() == transition_tag) {
+            if (current_register != nullptr && current_register->get_tag() == transition_tag) {
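The guard in this diff relies on short-circuit evaluation: `&&` evaluates its right operand only when the left operand is true. A small self-contained sketch of the same pattern follows; `Tag`, `TaggedRegister` and `register_matches_tag` are hypothetical names introduced for illustration, not types from the PR.

```cpp
#include <cassert>

// Hypothetical placeholder types; the real Tag and Register carry more state.
struct Tag {};

struct TaggedRegister {
    Tag const* tag;

    [[nodiscard]] auto get_tag() const -> Tag const* { return tag; }
};

// The null comparison is evaluated first. When it is false, `&&` skips the
// right operand, so get_tag() is never called through a null pointer.
[[nodiscard]] inline auto
register_matches_tag(TaggedRegister const* current_register, Tag const* transition_tag) -> bool {
    return nullptr != current_register && current_register->get_tag() == transition_tag;
}
```

Whether `current_register` can actually be null at this call site depends on code not shown in this thread; if it cannot, the guard is harmless but unnecessary.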
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-if (current_register->get_tag() == transition_tag) {
-    unvisited_stack.emplace(
-        RegisterOperator{current_register, RegisterOperation::Append},
-        dest_state
-    );
-} else {
+if (current_register != nullptr && current_register->get_tag() == transition_tag) {
+    unvisited_stack.emplace(
+        RegisterOperator{current_register, RegisterOperation::Append},
+        dest_state
+    );
+} else {

Comment on lines 149 to 151
[[nodiscard]] auto get_state() const -> NfaStateTypeTypedDfaState* { return m_nfa_state; }

bool operator<(RegOpNfaStatePair const& other) const { return m_nfa_state < other.m_nfa_state; }

⚠️ Potential issue

Typo in return type of get_state() method

The return type NfaStateTypeTypedDfaState* appears incorrect. It should likely be TypedDfaState*, matching the template parameter.

Apply this diff to fix the typo:

-        [[nodiscard]] auto get_state() const -> NfaStateTypeTypedDfaState* { return m_nfa_state; }
+        [[nodiscard]] auto get_state() const -> TypedDfaState* { return m_nfa_state; }

Committable suggestion skipped: line range outside the PR's diff.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🔭 Outside diff range comments (1)
CMakeLists.txt (1)

Line range hint 1-1: Update CMake minimum version format.

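CMake warns that compatibility with versions older than 3.10 will be removed, and the minimum declared here (3.5.1) falls below that threshold. Switching to a version range addresses the pipeline warning.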

Apply this diff:

-cmake_minimum_required(VERSION 3.5.1)
+cmake_minimum_required(VERSION 3.5...3.27)
♻️ Duplicate comments (1)
src/log_surgeon/finite_automata/DfaState.hpp (1)

60-63: ⚠️ Potential issue

Add bounds checking to prevent out-of-bounds access in next method

In the next method, when state_type is StateType::Byte, accessing m_bytes_transition[character] without verifying that character is within bounds can lead to out-of-bounds access if character is greater than or equal to cSizeOfByte. Adding a bounds check ensures safe access.

Apply this diff to add bounds checking:

    if constexpr (StateType::Byte == state_type) {
+       if (character >= cSizeOfByte) {
+           return nullptr;
+       }
        return m_bytes_transition[character];
    } else {
🧹 Nitpick comments (3)
src/log_surgeon/finite_automata/NfaState.hpp (2)

213-213: Use more idiomatic condition.

The condition false == unvisited_stack.empty() is less readable than !unvisited_stack.empty().

Apply this diff:

-    while (false == unvisited_stack.empty()) {
+    while (!unvisited_stack.empty()) {

322-322: Use more idiomatic conditions.

Similar readability improvements for the serialization conditions.

Apply this diff:

-        if (false == optional_serialized_positive_start_transition.has_value()) {
+        if (!optional_serialized_positive_start_transition.has_value()) {
-        if (false == optional_serialized_positive_end_transition.has_value()) {
+        if (!optional_serialized_positive_end_transition.has_value()) {
-        if (false == optional_serialized_negative_transition.has_value()) {
+        if (!optional_serialized_negative_transition.has_value()) {

Also applies to: 334-334, 345-345

tests/CMakeLists.txt (1)

Line range hint 1-33: Address CMake deprecation warning

The pipeline shows a CMake deprecation warning about compatibility with versions < 3.10.

Consider adding a minimum CMake version requirement:

+cmake_minimum_required(VERSION 3.10)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 139b097 and 49474bb.

📒 Files selected for processing (6)
  • CMakeLists.txt (2 hunks)
  • src/log_surgeon/finite_automata/DfaState.hpp (4 hunks)
  • src/log_surgeon/finite_automata/NfaState.hpp (1 hunks)
  • src/log_surgeon/finite_automata/StateType.hpp (1 hunks)
  • tests/CMakeLists.txt (2 hunks)
  • tests/test-nfa.cpp (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/log_surgeon/finite_automata/StateType.hpp
🧰 Additional context used
📓 Path-based instructions (3)
src/log_surgeon/finite_automata/NfaState.hpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

tests/test-nfa.cpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

src/log_surgeon/finite_automata/DfaState.hpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

📓 Learnings (4)
CMakeLists.txt (2)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#42
File: src/log_surgeon/finite_automata/RegexNFA.hpp:37-90
Timestamp: 2024-11-10T16:46:58.543Z
Learning: In this codebase, prefer code clarity over efficiency optimizations unless efficiency is a critical concern.
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:0-0
Timestamp: 2024-11-13T20:02:13.737Z
Learning: In `src/log_surgeon/finite_automata/RegexNFAState.hpp`, the constructor `RegexNFAState(std::set<Tag const*> tags, RegexNFAState const* dest_state)` has been updated to use `std::vector<Tag const*> tags` instead of `std::set`.
src/log_surgeon/finite_automata/NfaState.hpp (1)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:0-0
Timestamp: 2024-11-13T20:02:13.737Z
Learning: In `src/log_surgeon/finite_automata/RegexNFAState.hpp`, the constructor `RegexNFAState(std::set<Tag const*> tags, RegexNFAState const* dest_state)` has been updated to use `std::vector<Tag const*> tags` instead of `std::set`.
tests/test-nfa.cpp (2)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#42
File: src/log_surgeon/finite_automata/RegexNFA.hpp:37-90
Timestamp: 2024-11-10T16:46:58.543Z
Learning: In this codebase, prefer code clarity over efficiency optimizations unless efficiency is a critical concern.
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:0-0
Timestamp: 2024-11-13T20:02:13.737Z
Learning: In `src/log_surgeon/finite_automata/RegexNFAState.hpp`, the constructor `RegexNFAState(std::set<Tag const*> tags, RegexNFAState const* dest_state)` has been updated to use `std::vector<Tag const*> tags` instead of `std::set`.
src/log_surgeon/finite_automata/DfaState.hpp (2)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#47
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:127-128
Timestamp: 2024-11-10T16:46:58.543Z
Learning: `RegexNFAUTF8State` is defined as a type alias for `RegexNFAState<RegexNFAStateType::UTF8>`.
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:0-0
Timestamp: 2024-11-13T20:02:13.737Z
Learning: In `src/log_surgeon/finite_automata/RegexNFAState.hpp`, the constructor `RegexNFAState(std::set<Tag const*> tags, RegexNFAState const* dest_state)` has been updated to use `std::vector<Tag const*> tags` instead of `std::set`.
🪛 GitHub Actions: Build
tests/CMakeLists.txt

[warning] 1-1: CMake Deprecation Warning: Compatibility with CMake < 3.10 will be removed from a future version of CMake. Update the VERSION argument <min> value or use the <min>...<max> syntax.

CMakeLists.txt

[warning] 1-1: CMake Deprecation Warning: Compatibility with CMake < 3.10 will be removed from a future version of CMake. Update the VERSION argument <min> value or use the <min>...<max> syntax.

🔇 Additional comments (11)
tests/test-nfa.cpp (1)

10-31: Type aliases and includes updated correctly

The type aliases and includes have been updated appropriately to reflect the refactoring from RegexNFA to Nfa. This ensures consistency and correctness in the test code.

CMakeLists.txt (3)

71-73: LGTM! Consistent naming convention applied.

The renaming of LALR1Parser files to Lalr1Parser follows a more consistent naming convention.


99-103: LGTM! Improved modularity with new finite automata components.

The addition of separate files for DFA and NFA components improves modularity and separation of concerns.


105-105: LGTM! StateType abstraction added.

The addition of StateType.hpp provides a clean abstraction for state types.

src/log_surgeon/finite_automata/NfaState.hpp (5)

21-30: LGTM! Well-structured type definitions.

The template class and type aliases provide a clean abstraction for different state types.


38-42: LGTM! Constructor implementation aligns with learnings.

The constructors correctly use std::vector for tags as per the retrieved learning from PR#48.


272-296: Address TODO comment for negative tagged transitions.

The commented-out code and TODO indicate incomplete implementation of negative tagged transitions.

Would you like me to help implement the negative tagged transitions handling or create a GitHub issue to track this task?


236-236: ⚠️ Potential issue

Fix potential null pointer dereference.

The condition current_register->get_tag() == transition_tag may dereference a null pointer.

Apply this diff:

-            if (current_register->get_tag() == transition_tag) {
+            if (current_register != nullptr && current_register->get_tag() == transition_tag) {

Likely invalid or redundant comment.


258-258: ⚠️ Potential issue

Fix potential null pointer dereference.

Similar null pointer dereference issue in the positive tagged end transition handling.

Apply this diff:

-            if (current_register->get_tag() == transition_tag) {
+            if (current_register != nullptr && current_register->get_tag() == transition_tag) {

Likely invalid or redundant comment.

tests/CMakeLists.txt (2)

27-27: LGTM: Consistent test file naming

The renaming of test-NFA.cpp to test-nfa.cpp follows the lowercase naming convention, which aligns well with the class renaming changes.


14-16: Verify the inclusion of LALR(1) parser files

The addition of LALR(1) parser files (Lalr1Parser.cpp, .hpp, .tpp) appears unrelated to the PR's objective of moving epsilon_closure and nfa_to_dfa. Please clarify if these changes are necessary for this refactoring.

✅ Verification successful

LALR(1) parser files are correctly included

The LALR(1) parser is a core component that directly uses the NFA/DFA system through template parameters and inheritance. Its inclusion is necessary to maintain compatibility with the refactored finite automata components.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if the LALR(1) parser files are referenced by the modified NFA/DFA classes
rg -l "Lalr1Parser" "../src/log_surgeon/finite_automata/"

Length of output: 131


Script:

#!/bin/bash
# Find the correct paths
fd -t f "finite.*automata" src/
fd -t f "Lalr1Parser" src/

# Check for cross-references
rg "Lalr1Parser" --type cpp
rg "Nfa|Dfa" -g "*Lalr1Parser*" --type cpp

Length of output: 7833

@SharafMohamed SharafMohamed changed the title refactor: Move epsilon_closure and nfa_to_dfa from Lexer class to Dfa class. refactor: Add functionality for tagged DFA. Jan 8, 2025