Feat: Add Microsoft Outlook MSG File Parser #386

Open: 7 commits to merge into base: main

brianxiadong
Contributor

@brianxiadong brianxiadong commented Jan 23, 2025

Add Microsoft Outlook MSG File Parser

Features

  • Parses the Microsoft Outlook MSG file format
  • Converts MSG file content into Spring AI's unified Document format
  • Fully compatible with the Compound File Binary Format (CFB) file structure
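
To make the CFB point concrete: every Compound File Binary file, MSG files included, begins with the same 8-byte signature. The following is a minimal sketch of a signature check, not code from this PR; the class name `CfbSignature` and its methods are hypothetical and use only the JDK (Java 11+ for `readNBytes`).

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

/** Hypothetical helper, not part of this PR: detects the CFB header signature. */
final class CfbSignature {

    // The 8-byte signature that opens every Compound File Binary file.
    private static final byte[] MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    private CfbSignature() {
    }

    /** Returns true if the first 8 bytes of the array match the CFB signature. */
    static boolean isCfb(byte[] header) {
        if (header == null || header.length < MAGIC.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(header, MAGIC.length), MAGIC);
    }

    /** Reads up to 8 bytes from the stream and checks them against the signature. */
    static boolean isCfb(InputStream in) throws IOException {
        return isCfb(in.readNBytes(MAGIC.length));
    }
}
```

A reader could run a check like this before attempting a full parse, so that a non-MSG input fails fast with a clear error.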

Technical Implementation

  • Adds MsgEmailDocumentReader, which implements Spring AI's DocumentReader interface
  • Reads MSG files as a stream to avoid running out of memory
  • Implements comprehensive error handling and logging
  • Closes resources gracefully
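
The streaming and cleanup points above can be sketched with plain JDK I/O. This is an illustration under assumptions, not the PR's implementation: `ChunkedReader` and `countBytes` are hypothetical names, and the loop only counts bytes where a real parser would decode each chunk. The fixed-size buffer keeps memory use bounded, and try-with-resources closes the stream even when a read fails.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.function.Supplier;

/** Hypothetical sketch, not part of this PR: bounded-memory reading with guaranteed cleanup. */
final class ChunkedReader {

    private ChunkedReader() {
    }

    /** Reads the whole stream in chunks of at most chunkSize bytes and returns the byte count. */
    static long countBytes(Supplier<InputStream> source, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        byte[] buffer = new byte[chunkSize]; // the only buffer allocated, whatever the file size
        long total = 0;
        // try-with-resources closes the wrapper, which in turn closes the underlying stream
        try (InputStream in = new BufferedInputStream(source.get())) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                total += read; // a real parser would process buffer[0..read) here
            }
        } catch (IOException e) {
            // surface the failure with context instead of swallowing it
            throw new UncheckedIOException("Failed to read MSG stream", e);
        }
        return total;
    }
}
```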

Core Classes

  • MsgEmailDocumentReader: core parser class for MSG files
  • MsgParser: parser for the MSG file structure
  • MsgEmailElement: data model for MSG email elements
  • MsgEmailParser: utility for parsing and converting email content
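
As a rough picture of how a data model and a conversion utility might fit together, here is a minimal sketch. It does not reproduce the PR's classes: `EmailElementSketch`, `DocumentSketch`, and `EmailToDocumentSketch` are hypothetical stand-ins for MsgEmailElement, Spring AI's Document, and MsgEmailParser, and the field names are assumptions. It requires Java 16+ for records.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for an email data model; the field names are assumptions. */
record EmailElementSketch(String subject, String from, String body) {
}

/** Hypothetical stand-in for a document: body text plus metadata. */
record DocumentSketch(String text, Map<String, Object> metadata) {
}

/** Hypothetical conversion utility: maps an email onto text and metadata. */
final class EmailToDocumentSketch {

    private EmailToDocumentSketch() {
    }

    static DocumentSketch toDocument(EmailElementSketch email) {
        // LinkedHashMap keeps metadata keys in insertion order
        Map<String, Object> metadata = new LinkedHashMap<>();
        metadata.put("subject", email.subject());
        metadata.put("from", email.from());
        // the message body becomes the document text; headers travel as metadata
        return new DocumentSketch(email.body(), metadata);
    }
}
```

Keeping headers in metadata rather than in the text lets downstream consumers filter by sender or subject without re-parsing the body.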

Test Coverage

  • Adds the unit test class MsgEmailDocumentReaderTest
  • Includes test cases for exception scenarios
  • Provides test resource files

Dependencies

  • Built on the Spring AI framework
  • Uses the Apache Commons IO utility library
  • Uses the SLF4J logging framework

fixes #293

Development

Successfully merging this pull request may close these issues.

add document mail platform Email