refactor: Add functionality for tagged DFA. #62
…TCapture; Use cbegin() and cend().
Co-authored-by: Lin Zhihao <[email protected]>
Actionable comments posted: 4
♻️ Duplicate comments (1)
src/log_surgeon/finite_automata/DfaState.hpp (1)
60-64: ⚠️ Potential issue
Add bounds checking to prevent out-of-bounds access in `next` method
In the `next` method, when `state_type` is `StateType::Byte`, accessing `m_bytes_transition[character]` without verifying that `character` is within bounds can lead to out-of-bounds access if `character` is greater than or equal to `cSizeOfByte`. Adding a bounds check ensures safe access.
Apply this diff to add bounds checking:
if constexpr (StateType::Byte == state_type) {
+    if (character >= cSizeOfByte) {
+        return nullptr;
+    }
    return m_bytes_transition[character];
} else {
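A minimal sketch of the guarded lookup this comment describes, under stated assumptions: `ByteState` is a hypothetical stand-in for the byte specialisation of `DfaState`, `cSizeOfByte` is assumed to be 256, and `character` is assumed to be wider than a byte. If the real parameter is a `uint8_t`, the comparison can never be true and the check has no effect.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Assumption: one transition slot per possible byte value.
constexpr std::size_t cSizeOfByte = 256;

// Hypothetical stand-in for the byte specialisation of DfaState.
struct ByteState {
    // Value-initialised, so every slot starts as nullptr.
    std::array<ByteState*, cSizeOfByte> m_bytes_transition{};

    // `character` is deliberately wider than a byte so the check is meaningful.
    [[nodiscard]] auto next(std::uint32_t character) const -> ByteState* {
        if (character >= cSizeOfByte) {
            return nullptr;  // out of range: report "no transition"
        }
        return m_bytes_transition[character];
    }
};
```

With this shape, a caller cannot distinguish "no transition for this byte" from "value out of range", since both return `nullptr`. Whether that is acceptable depends on how the lexer treats a missing transition.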
🧹 Nitpick comments (2)
src/log_surgeon/finite_automata/Register.hpp (1)
16-58: Consider adding move semantics and memory management
The class design could benefit from several improvements:
- Define copy/move constructors and assignment operators
- Add destructor or use smart pointers for proper cleanup
- Consider reserving space in `m_positions` vector if typical size is known
Example implementation:
class Register {
public:
    // ... existing code ...

    // Add move semantics
    Register(Register&&) noexcept = default;
    Register& operator=(Register&&) noexcept = default;

    // Prevent copying if tag ownership is exclusive
    Register(Register const&) = delete;
    Register& operator=(Register const&) = delete;

    // Optional: Reserve space if typical size is known
    void reserve_positions(size_t size) { m_positions.reserve(size); }
};
src/log_surgeon/finite_automata/DfaState.hpp (1)
41-41: Consider passing `uint8_t` by value
In `add_byte_transition`, passing `uint8_t const& byte` by const reference may be less efficient than passing by value, since `uint8_t` is a small type. Consider passing `byte` by value.
Apply this diff:
-auto add_byte_transition(uint8_t const& byte, DfaState* dest_state) -> void {
+auto add_byte_transition(uint8_t byte, DfaState* dest_state) -> void {
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
- examples/intersect-test.cpp (4 hunks)
- src/log_surgeon/Lalr1Parser.hpp (10 hunks)
- src/log_surgeon/Lexer.tpp (16 hunks)
- src/log_surgeon/finite_automata/Dfa.hpp (1 hunks)
- src/log_surgeon/finite_automata/DfaState.hpp (4 hunks)
- src/log_surgeon/finite_automata/DfaStatePair.hpp (5 hunks)
- src/log_surgeon/finite_automata/NfaState.hpp (1 hunks)
- src/log_surgeon/finite_automata/Register.hpp (1 hunks)
- src/log_surgeon/finite_automata/StateType.hpp (1 hunks)
- src/log_surgeon/finite_automata/TaggedTransition.hpp (5 hunks)
✅ Files skipped from review due to trivial changes (1)
- src/log_surgeon/finite_automata/StateType.hpp
🚧 Files skipped from review as they are similar to previous changes (6)
- src/log_surgeon/finite_automata/Dfa.hpp
- src/log_surgeon/Lalr1Parser.hpp
- src/log_surgeon/finite_automata/DfaStatePair.hpp
- examples/intersect-test.cpp
- src/log_surgeon/finite_automata/TaggedTransition.hpp
- src/log_surgeon/Lexer.tpp
🧰 Additional context used
📓 Path-based instructions (3)
src/log_surgeon/finite_automata/DfaState.hpp (1)
Pattern `**/*.{cpp,hpp,java,js,jsx,ts,tsx}`: Prefer `false == <expression>` rather than `!<expression>`.
src/log_surgeon/finite_automata/Register.hpp (1)
Pattern `**/*.{cpp,hpp,java,js,jsx,ts,tsx}`: Prefer `false == <expression>` rather than `!<expression>`.
src/log_surgeon/finite_automata/NfaState.hpp (1)
Pattern `**/*.{cpp,hpp,java,js,jsx,ts,tsx}`: Prefer `false == <expression>` rather than `!<expression>`.
📓 Learnings (3)
src/log_surgeon/finite_automata/DfaState.hpp (2)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#47
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:127-128
Timestamp: 2024-11-10T16:46:58.543Z
Learning: `RegexNFAUTF8State` is defined as a type alias for `RegexNFAState<RegexNFAStateType::UTF8>`.
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:0-0
Timestamp: 2024-11-13T20:02:13.737Z
Learning: In `src/log_surgeon/finite_automata/RegexNFAState.hpp`, the constructor `RegexNFAState(std::set<Tag const*> tags, RegexNFAState const* dest_state)` has been updated to use `std::vector<Tag const*> tags` instead of `std::set`.
src/log_surgeon/finite_automata/Register.hpp (1)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#56
File: src/log_surgeon/finite_automata/RegisterHandler.hpp:0-0
Timestamp: 2024-11-27T22:25:35.608Z
Learning: In the `RegisterHandler` class in `src/log_surgeon/finite_automata/RegisterHandler.hpp`, the methods `add_register` and `append_position` rely on `emplace_back` and `m_prefix_tree.insert` to handle exceptions correctly and maintain consistent state without requiring additional exception handling.
src/log_surgeon/finite_automata/NfaState.hpp (1)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:0-0
Timestamp: 2024-11-13T20:02:13.737Z
Learning: In `src/log_surgeon/finite_automata/RegexNFAState.hpp`, the constructor `RegexNFAState(std::set<Tag const*> tags, RegexNFAState const* dest_state)` has been updated to use `std::vector<Tag const*> tags` instead of `std::set`.
🔇 Additional comments (3)
src/log_surgeon/finite_automata/Register.hpp (1)
44-57: ⚠️ Potential issue
Address unused member and add null pointer validation
The RegisterOperator class has several issues:
- `m_operation` member is never used
- No null pointer validation for `m_register`
- Raw pointer ownership semantics aren't clear
Apply this diff to add null pointer validation:
[[nodiscard]] auto get_register() const -> Register* { return m_register; }
-[[nodiscard]] auto is_start() const -> bool { return m_register->is_start(); }
+[[nodiscard]] auto is_start() const -> bool {
+    if (nullptr == m_register) {
+        throw std::runtime_error("Register pointer is null");
+    }
+    return m_register->is_start();
+}
The `m_operation` member appears unused. Let's verify its usage across the codebase:
src/log_surgeon/finite_automata/DfaState.hpp (1)
27-27: Initialisation of `m_bytes_transition` array prevents undefined behaviour
Good job on initialising the `m_bytes_transition` array in the constructor. This ensures all pointers are set to `nullptr`, preventing undefined behaviour due to uninitialised pointers.
src/log_surgeon/finite_automata/NfaState.hpp (1)
272-273: Reminder to address the TODO comment for handling negative tagged transitions
The TODO comment indicates that handling of multiple negative tags in the `epsilon_closure` method is incomplete. This could affect the correctness of the NFA implementation. Please address this TODO to ensure proper handling of negative tagged transitions.
Would you like me to help implement the necessary changes or open a GitHub issue to track this task?
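For context on what the TODO touches: an epsilon closure is commonly computed with an explicit stack and a visited set. The sketch below shows only that generic traversal. `SimpleNfaState` is a hypothetical type with no tags or registers, so this is not the `epsilon_closure` in NfaState.hpp, and it says nothing about how negative tags should be handled there.

```cpp
#include <set>
#include <stack>
#include <vector>

// Hypothetical, simplified NFA state: epsilon edges only, no tags or registers.
struct SimpleNfaState {
    std::vector<SimpleNfaState const*> epsilon_transitions;
};

// Returns every state reachable from `start` through epsilon edges,
// including `start` itself.
[[nodiscard]] inline auto epsilon_closure(SimpleNfaState const* start)
        -> std::set<SimpleNfaState const*> {
    std::set<SimpleNfaState const*> closure;
    std::stack<SimpleNfaState const*> unvisited_stack;
    unvisited_stack.push(start);
    while (false == unvisited_stack.empty()) {
        auto const* current = unvisited_stack.top();
        unvisited_stack.pop();
        // insert() reports whether the state was new; skip states seen before
        // so that epsilon cycles terminate.
        if (false == closure.insert(current).second) {
            continue;
        }
        for (auto const* dest : current->epsilon_transitions) {
            unvisited_stack.push(dest);
        }
    }
    return closure;
}
```

A tagged variant would presumably carry register operations alongside each stacked state, as the `unvisited_stack.emplace(RegisterOperator{...}, dest_state)` calls in this PR suggest, but that part is left out here.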
#ifndef LOG_SURGEON_FINITE_AUTOMATA_REGISTER
#define LOG_SURGEON_FINITE_AUTOMATA_REGISTER

#include <cstdint>

#include <log_surgeon/finite_automata/Tag.hpp>
enum class RegisterOperation {
    Assign,
    Append,
    Clear,
    None,
    Pop,
};

namespace log_surgeon::finite_automata {
Add missing include and move enum into namespace
Two issues need attention:
- The `<vector>` header is required for `std::vector` usage
- The `RegisterOperation` enum should be within the `log_surgeon::finite_automata` namespace for consistency
Apply this diff to fix the issues:
#ifndef LOG_SURGEON_FINITE_AUTOMATA_REGISTER
#define LOG_SURGEON_FINITE_AUTOMATA_REGISTER
#include <cstdint>
+#include <vector>
#include <log_surgeon/finite_automata/Tag.hpp>
+
+namespace log_surgeon::finite_automata {
enum class RegisterOperation {
Assign,
Append,
Clear,
None,
Pop,
};
-namespace log_surgeon::finite_automata {
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
#ifndef LOG_SURGEON_FINITE_AUTOMATA_REGISTER
#define LOG_SURGEON_FINITE_AUTOMATA_REGISTER
#include <cstdint>
#include <vector>
#include <log_surgeon/finite_automata/Tag.hpp>
namespace log_surgeon::finite_automata {
enum class RegisterOperation {
    Assign,
    Append,
    Clear,
    None,
    Pop,
};
class Register {
public:
    explicit Register(Tag* tag, bool const is_start) : m_tag{tag}, m_is_start(is_start) {}

    auto add_pos(uint32_t const pos) -> void { m_positions.push_back(pos); }

    auto update_last_position(uint32_t const pos) -> void { m_positions.back() = pos; }

    auto negate_last_position() -> void { m_positions.pop_back(); }

    auto negate_all_positions() -> void { m_positions.clear(); }

    [[nodiscard]] auto get_tag() const -> Tag* { return m_tag; }

    [[nodiscard]] auto is_start() const -> bool { return m_is_start; }

    [[nodiscard]] auto get_last_position() const -> uint32_t { return m_positions.back(); }

    [[nodiscard]] auto get_all_positions() const -> std::vector<uint32_t> const& {
        return m_positions;
    }

private:
    Tag* m_tag;
    bool m_is_start;
    std::vector<uint32_t> m_positions;
};
🛠️ Refactor suggestion
Add bounds checking and clarify ownership semantics
Several concerns in the Register class implementation:
- Methods accessing `m_positions` don't check for empty vector
- Raw pointer ownership semantics aren't clear
- Public methods lack documentation
Apply this diff to add bounds checking:
auto update_last_position(uint32_t const pos) -> void {
+    if (m_positions.empty()) {
+        throw std::out_of_range("Cannot update position in empty register");
+    }
    m_positions.back() = pos;
}
auto negate_last_position() -> void {
+    if (m_positions.empty()) {
+        throw std::out_of_range("Cannot negate position in empty register");
+    }
    m_positions.pop_back();
}
[[nodiscard]] auto get_last_position() const -> uint32_t {
+    if (m_positions.empty()) {
+        throw std::out_of_range("Cannot get position from empty register");
+    }
    return m_positions.back();
}
Consider these improvements:
- Use `std::unique_ptr` or document ownership semantics for `m_tag`
- Add method documentation explaining preconditions and exceptions
- Consider adding a method to check if positions exist before access
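The last suggestion above, a way to check for positions before access, could look roughly like the following. This is a sketch only: `PositionRegister`, `has_positions`, and `try_get_last_position` are hypothetical names that do not appear in the PR, the tag pointer and start flag are omitted, and `std::optional` assumes C++17.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical, trimmed-down register: tag pointer and start flag omitted.
class PositionRegister {
public:
    auto add_pos(std::uint32_t const pos) -> void { m_positions.push_back(pos); }

    // Lets callers test the precondition of back()/pop_back() explicitly.
    [[nodiscard]] auto has_positions() const -> bool { return false == m_positions.empty(); }

    // Returns an empty optional instead of throwing when nothing is recorded.
    [[nodiscard]] auto try_get_last_position() const -> std::optional<std::uint32_t> {
        if (m_positions.empty()) {
            return std::nullopt;
        }
        return m_positions.back();
    }

private:
    std::vector<std::uint32_t> m_positions;
};
```

Compared with the throwing version in the diff, this moves the decision to the caller: code that treats an empty register as a normal case can branch on the optional, while code that treats it as a bug can still call `.value()` and get a `std::bad_optional_access`.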
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
class Register {
public:
    explicit Register(Tag* tag, bool const is_start) : m_tag{tag}, m_is_start(is_start) {}
    auto add_pos(uint32_t const pos) -> void { m_positions.push_back(pos); }
    auto update_last_position(uint32_t const pos) -> void {
        if (m_positions.empty()) {
            throw std::out_of_range("Cannot update position in empty register");
        }
        m_positions.back() = pos;
    }
    auto negate_last_position() -> void {
        if (m_positions.empty()) {
            throw std::out_of_range("Cannot negate position in empty register");
        }
        m_positions.pop_back();
    }
    auto negate_all_positions() -> void { m_positions.clear(); }
    [[nodiscard]] auto get_tag() const -> Tag* { return m_tag; }
    [[nodiscard]] auto is_start() const -> bool { return m_is_start; }
    [[nodiscard]] auto get_last_position() const -> uint32_t {
        if (m_positions.empty()) {
            throw std::out_of_range("Cannot get position from empty register");
        }
        return m_positions.back();
    }
    [[nodiscard]] auto get_all_positions() const -> std::vector<uint32_t> const& {
        return m_positions;
    }
private:
    Tag* m_tag;
    bool m_is_start;
    std::vector<uint32_t> m_positions;
};
if (current_register->get_tag() == transition_tag) {
    unvisited_stack.emplace(
            RegisterOperator{current_register, RegisterOperation::Append},
            dest_state
    );
} else {
Possible null pointer dereference of `current_register`
In the condition `if (current_register->get_tag() == transition_tag)`, `current_register` could be `nullptr`, which may lead to a null pointer dereference. Please ensure `current_register` is not `nullptr` before dereferencing.
Apply this diff to check for `nullptr` before dereferencing:
-if (current_register->get_tag() == transition_tag) {
+if (current_register != nullptr && current_register->get_tag() == transition_tag) {
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
if (current_register != nullptr && current_register->get_tag() == transition_tag) {
    unvisited_stack.emplace(
            RegisterOperator{current_register, RegisterOperation::Append},
            dest_state
    );
} else {
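The suggested guard relies on `&&` short-circuiting: the right-hand operand is evaluated only when the left-hand one is true. A small sketch of that pattern in isolation, using hypothetical stand-in types (`Tag`, `TaggedRegister`, and `register_matches_tag` are illustrative names, not the PR's classes):

```cpp
// Hypothetical stand-ins for the PR's Tag and Register types.
struct Tag {};

struct TaggedRegister {
    Tag const* tag;

    [[nodiscard]] auto get_tag() const -> Tag const* { return tag; }
};

// && evaluates left to right and stops at the first false operand,
// so get_tag() is never called through a null pointer.
[[nodiscard]] inline auto
register_matches_tag(TaggedRegister const* current_register, Tag const* transition_tag) -> bool {
    return current_register != nullptr && current_register->get_tag() == transition_tag;
}
```

Note that with this guard a null `current_register` falls through to the `else` branch of the original code, so the guard is only correct if that branch is also safe for a null register.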
[[nodiscard]] auto get_state() const -> NfaStateTypeTypedDfaState* { return m_nfa_state; }

bool operator<(RegOpNfaStatePair const& other) const { return m_nfa_state < other.m_nfa_state; }
Typo in return type of `get_state()` method
The return type `NfaStateTypeTypedDfaState*` appears incorrect. It should likely be `TypedDfaState*`, matching the template parameter.
Apply this diff to fix the typo:
-[[nodiscard]] auto get_state() const -> NfaStateTypeTypedDfaState* { return m_nfa_state; }
+[[nodiscard]] auto get_state() const -> TypedDfaState* { return m_nfa_state; }
Committable suggestion skipped: line range outside the PR's diff.
…og-surgeon into refactor-nfa-to-dfa
Actionable comments posted: 0
🔭 Outside diff range comments (1)
CMakeLists.txt (1)
Line range hint 1-1: Update CMake minimum version format.
The CMake version format is deprecated. Update to use the new format to address the pipeline warning.
Apply this diff:
-cmake_minimum_required(VERSION 3.5.1)
+cmake_minimum_required(VERSION 3.5...3.27)
♻️ Duplicate comments (1)
src/log_surgeon/finite_automata/DfaState.hpp (1)
60-63: ⚠️ Potential issue
Add bounds checking to prevent out-of-bounds access in `next` method
In the `next` method, when `state_type` is `StateType::Byte`, accessing `m_bytes_transition[character]` without verifying that `character` is within bounds can lead to out-of-bounds access if `character` is greater than or equal to `cSizeOfByte`. Adding a bounds check ensures safe access.
Apply this diff to add bounds checking:
if constexpr (StateType::Byte == state_type) {
+    if (character >= cSizeOfByte) {
+        return nullptr;
+    }
    return m_bytes_transition[character];
} else {
🧹 Nitpick comments (3)
src/log_surgeon/finite_automata/NfaState.hpp (2)
213-213: Use more idiomatic condition.
The condition `false == unvisited_stack.empty()` is less readable than `!unvisited_stack.empty()`.
Apply this diff:
-while (false == unvisited_stack.empty()) {
+while (!unvisited_stack.empty()) {
322-322: Use more idiomatic conditions.
Similar readability improvements for the serialization conditions.
Apply this diff:
-if (false == optional_serialized_positive_start_transition.has_value()) {
+if (!optional_serialized_positive_start_transition.has_value()) {
-if (false == optional_serialized_positive_end_transition.has_value()) {
+if (!optional_serialized_positive_end_transition.has_value()) {
-if (false == optional_serialized_negative_transition.has_value()) {
+if (!optional_serialized_negative_transition.has_value()) {
Also applies to: 334-334, 345-345
tests/CMakeLists.txt (1)
Line range hint 1-33: Address CMake deprecation warning
The pipeline shows a CMake deprecation warning about compatibility with versions < 3.10.
Consider adding a minimum CMake version requirement:
+cmake_minimum_required(VERSION 3.10)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- CMakeLists.txt (2 hunks)
- src/log_surgeon/finite_automata/DfaState.hpp (4 hunks)
- src/log_surgeon/finite_automata/NfaState.hpp (1 hunks)
- src/log_surgeon/finite_automata/StateType.hpp (1 hunks)
- tests/CMakeLists.txt (2 hunks)
- tests/test-nfa.cpp (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/log_surgeon/finite_automata/StateType.hpp
🧰 Additional context used
📓 Path-based instructions (3)
src/log_surgeon/finite_automata/NfaState.hpp (1)
Pattern `**/*.{cpp,hpp,java,js,jsx,ts,tsx}`: Prefer `false == <expression>` rather than `!<expression>`.
tests/test-nfa.cpp (1)
Pattern `**/*.{cpp,hpp,java,js,jsx,ts,tsx}`: Prefer `false == <expression>` rather than `!<expression>`.
src/log_surgeon/finite_automata/DfaState.hpp (1)
Pattern `**/*.{cpp,hpp,java,js,jsx,ts,tsx}`: Prefer `false == <expression>` rather than `!<expression>`.
📓 Learnings (4)
CMakeLists.txt (2)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#42
File: src/log_surgeon/finite_automata/RegexNFA.hpp:37-90
Timestamp: 2024-11-10T16:46:58.543Z
Learning: In this codebase, prefer code clarity over efficiency optimizations unless efficiency is a critical concern.
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:0-0
Timestamp: 2024-11-13T20:02:13.737Z
Learning: In `src/log_surgeon/finite_automata/RegexNFAState.hpp`, the constructor `RegexNFAState(std::set<Tag const*> tags, RegexNFAState const* dest_state)` has been updated to use `std::vector<Tag const*> tags` instead of `std::set`.
src/log_surgeon/finite_automata/NfaState.hpp (1)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:0-0
Timestamp: 2024-11-13T20:02:13.737Z
Learning: In `src/log_surgeon/finite_automata/RegexNFAState.hpp`, the constructor `RegexNFAState(std::set<Tag const*> tags, RegexNFAState const* dest_state)` has been updated to use `std::vector<Tag const*> tags` instead of `std::set`.
tests/test-nfa.cpp (2)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#42
File: src/log_surgeon/finite_automata/RegexNFA.hpp:37-90
Timestamp: 2024-11-10T16:46:58.543Z
Learning: In this codebase, prefer code clarity over efficiency optimizations unless efficiency is a critical concern.
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:0-0
Timestamp: 2024-11-13T20:02:13.737Z
Learning: In `src/log_surgeon/finite_automata/RegexNFAState.hpp`, the constructor `RegexNFAState(std::set<Tag const*> tags, RegexNFAState const* dest_state)` has been updated to use `std::vector<Tag const*> tags` instead of `std::set`.
src/log_surgeon/finite_automata/DfaState.hpp (2)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#47
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:127-128
Timestamp: 2024-11-10T16:46:58.543Z
Learning: `RegexNFAUTF8State` is defined as a type alias for `RegexNFAState<RegexNFAStateType::UTF8>`.
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:0-0
Timestamp: 2024-11-13T20:02:13.737Z
Learning: In `src/log_surgeon/finite_automata/RegexNFAState.hpp`, the constructor `RegexNFAState(std::set<Tag const*> tags, RegexNFAState const* dest_state)` has been updated to use `std::vector<Tag const*> tags` instead of `std::set`.
🪛 GitHub Actions: Build
tests/CMakeLists.txt
[warning] 1-1: CMake Deprecation Warning: Compatibility with CMake < 3.10 will be removed from a future version of CMake. Update the VERSION argument or use ... syntax.
CMakeLists.txt
[warning] 1-1: CMake Deprecation Warning: Compatibility with CMake < 3.10 will be removed from a future version of CMake. Update the VERSION argument or use ... syntax.
🔇 Additional comments (11)
tests/test-nfa.cpp (1)
10-31: Type aliases and includes updated correctly
The type aliases and includes have been updated appropriately to reflect the refactoring from `RegexNFA` to `Nfa`. This ensures consistency and correctness in the test code.
CMakeLists.txt (3)
71-73: LGTM! Consistent naming convention applied.
The renaming of LALR1Parser files to Lalr1Parser follows a more consistent naming convention.
99-103: LGTM! Improved modularity with new finite automata components.
The addition of separate files for DFA and NFA components improves modularity and separation of concerns.
105-105: LGTM! StateType abstraction added.
The addition of StateType.hpp provides a clean abstraction for state types.
src/log_surgeon/finite_automata/NfaState.hpp (5)
21-30: LGTM! Well-structured type definitions.
The template class and type aliases provide a clean abstraction for different state types.
38-42: LGTM! Constructor implementation aligns with learnings.
The constructors correctly use std::vector for tags as per the retrieved learning from PR#48.
272-296: Address TODO comment for negative tagged transitions.
The commented-out code and TODO indicate incomplete implementation of negative tagged transitions.
Would you like me to help implement the negative tagged transitions handling or create a GitHub issue to track this task?
236-236: ⚠️ Potential issue
Fix potential null pointer dereference.
The condition `current_register->get_tag() == transition_tag` may dereference a null pointer.
Apply this diff:
-if (current_register->get_tag() == transition_tag) {
+if (current_register != nullptr && current_register->get_tag() == transition_tag) {
Likely invalid or redundant comment.
258-258: ⚠️ Potential issue
Fix potential null pointer dereference.
Similar null pointer dereference issue in the positive tagged end transition handling.
Apply this diff:
-if (current_register->get_tag() == transition_tag) {
+if (current_register != nullptr && current_register->get_tag() == transition_tag) {
Likely invalid or redundant comment.
tests/CMakeLists.txt (2)
27-27: LGTM: Consistent test file naming
The renaming of `test-NFA.cpp` to `test-nfa.cpp` follows the lowercase naming convention, which aligns well with the class renaming changes.
14-16: Verify the inclusion of LALR(1) parser files
The addition of LALR(1) parser files (`Lalr1Parser.cpp`, `.hpp`, `.tpp`) appears unrelated to the PR's objective of moving `epsilon_closure` and `nfa_to_dfa`. Please clarify if these changes are necessary for this refactoring.
✅ Verification successful
LALR(1) parser files are correctly included
The LALR(1) parser is a core component that directly uses the NFA/DFA system through template parameters and inheritance. Its inclusion is necessary to maintain compatibility with the refactored finite automata components.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check if the LALR(1) parser files are referenced by the modified NFA/DFA classes
rg -l "Lalr1Parser" "../src/log_surgeon/finite_automata/"
Length of output: 131
Script:
#!/bin/bash
# Find the correct paths
fd -t f "finite.*automata" src/
fd -t f "Lalr1Parser" src/
# Check for cross-references
rg "Lalr1Parser" --type cpp
rg "Nfa|Dfa" -g "*Lalr1Parser*" --type cpp
Length of output: 7833
Move `epsilon_closure` and `nfa_to_dfa` from `Lexer` class to `Dfa` class.
References
Description
Validation performed
Previously existing tests succeed.
Summary by CodeRabbit
Release Notes
Overview
This release introduces significant refactoring and type improvements across the log_surgeon library, focusing on enhancing code clarity and consistency.
Changes
Type Renaming
- `RegexNFAByteState` to `ByteNfaState`
Code Modernization
- `auto` return types with explicit trailing return type syntax
New Features
- `Register` and `RegisterOperator` classes in finite automata module
- `Dfa` class with improved state management
Performance
The changes primarily focus on internal code structure and do not introduce significant user-facing modifications.