...效运行的任务。事件驱动编程 , 事件驱动编程是一种编程范式，它基于“事件”这一核心概念，程序的执行流程由事件触发。在Node.js中，事件驱动机制意味着当某个特定事件（如网络连接建立、数据接收完毕等）发生时，会触发相应的回调函数进行处理，而不是等待整个任务线性执行完毕。这种模型允许Node.js能够同时处理多个并发请求，实现非阻塞I/O操作，极大地提升了服务端应用程序的性能和效率。回调函数 , 回调函数是作为参数传递给另一个函数的函数，这个函数会在预定条件满足或特定事件发生时被调用。在Node.js异步编程中，回调函数尤为常见，例如HTTP请求完成后的响应处理。文章中的http.get()方法就接受一个回调函数作为参数，该函数在HTTP请求完成后被执行，从而实现了异步处理。当在错误处理或数据流事件（如 data 和 end ）上设置回调函数时，可以确保相关逻辑在合适的时机得到执行，而不会阻塞主线程的其他任务。

2023-03-20 14:09:08

121

雪域高原-t

Superset

Superset 数据源连接配置：精细化自定义SQLAlchemy URI实现数据分析与可视化，含SSL加密连接实例

...展身手，自由定制数据连接配置。就像在玩乐高积木一样，你可以自定义SQLAlchemy URI设置，想怎么拼就怎么拼！本文将带您深入探索这一功能，通过实例详解如何在Superset中自定义SQLAlchemy URI，以满足您特定的数据源连接需求。 1. SQLAlchemy与URI简介首先，我们来快速了解一下SQLAlchemy以及其URI（Uniform Resource Identifier）的概念。SQLAlchemy，这可是Python世界里鼎鼎大名的关系型数据库操作工具，大家都抢着用。而URI呢，你可以理解为一个超级实用的“地址条”，它用一种统一格式的字符串，帮我们精准定位并解锁访问数据库资源的各种路径和方式，是不是很给力？在Superset中，我们通过配置SQLAlchemy URI来建立与各种数据库（如MySQL、PostgreSQL、Oracle等）的连接。例如，一个基本的PostgreSQL的SQLAlchemy URI可能看起来像这样： python postgresql://username:password@host:port/database 这里的各个部分分别代表数据库用户名、密码、主机地址、端口号和数据库名。 2. Superset中的SQLAlchemy URI设置在Superset中，我们可以在“Sources” -> “Databases”页面添加或编辑数据源时，自定义SQLAlchemy URI。下面让我们一步步揭开这个过程： 2.1 添加新的数据库连接 (1) 登录到您的Superset后台管理界面，点击左侧菜单栏的"Sources"，然后选择"Databases"。 (2) 点击右上角的"+"按钮，开始创建一个新的数据库连接。 (3) 在弹出的表单中，选择适合您的数据库引擎类型，如"PostgreSQL"，并在"Database Connection URL"字段中填写您的自定义SQLAlchemy URI。 2.2 示例代码假设我们要连接到一台本地运行的PostgreSQL数据库，用户名为superset_user，密码为secure_password，端口为5432，数据库名为superset_db，则对应的SQLAlchemy URI如下： python postgresql://superset_user:secure_password@localhost:5432/superset_db 填入上述信息后，点击"Save"保存设置，Superset便会使用该URI与指定的数据库建立连接。 2.3 进阶应用对于一些需要额外参数的数据库（比如SSL加密连接、指定编码格式等），可以在URI中进一步扩展： python postgresql://superset_user:secure_password@localhost:5432/superset_db?sslmode=require&charset=utf8 这里，sslmode=require指定了启用SSL加密连接，charset=utf8则设置了字符集。 3. 思考与探讨在实际应用场景中，灵活运用SQLAlchemy URI的自定义能力，可以极大地增强Superset的数据源兼容性与安全性。甭管是云端飘着的RDS服务，还是公司里头自个儿搭建的各种数据库系统，只要你摸准了那个URI构造的门道，咱们就能轻轻松松把它们拽进Superset这个大舞台，然后麻溜儿地对数据进行深度分析，再活灵活现地展示出来，那感觉倍儿爽！在面对复杂的数据库连接问题时，别忘了查阅SQLAlchemy官方文档以获取更多关于URI配置的细节和选项，同时结合Superset的强大功能，定能让您的数据驱动决策之路更加顺畅！总的来说，掌握并熟练运用自定义SQLAlchemy URI的技巧，就像是赋予了Superset一把打开任意数据宝库的钥匙，无论数据藏于何处，都能随心所欲地进行探索挖掘。这就是Superset的魅力所在，也是我们在数据科学道路上不断求索的动力源泉！

2024-03-19 10:43:57

红尘漫步

SpringBoot

SpringBoot应用中处理MySQL数据库版本兼容性：部署时的迁移工具与配置检查实践

...，假设我们现在用的是MySQL 5.6版本的数据库，但咱们的应用程序却偷偷依赖了MySQL 5.7里的一些新功能。这样的话，就极有可能会闹点儿小矛盾，出点问题。三、解决方案那么，当我们在部署到某些数据库版本时出现问题时，我们应该如何解决呢？首先，我们需要检查我们的应用程序是否与目标数据库版本兼容。这可以通过查看应用程序的配置文件或者依赖关系来完成。比如，我们可以翻翻pom.xml这个配置文件，瞅瞅里面的依赖项是不是对某个特定的数据库版本提供了支持。其次，如果我们的应用程序确实需要使用某些只在新版本数据库中提供的功能，那么我们需要更新我们的数据库。这可以通过使用数据库迁移工具来完成。例如，我们可以使用Flyway或者Liquibase这样的工具，将旧版本的数据库升级到新版本。最后，如果我们不能更新数据库，那么我们可以考虑修改我们的应用程序代码，使其能够在旧版本数据库上运行。这可能意味着咱们得采取一些特别的手段，比如说，别去碰那些新潮的数据库功能，或者亲自动手编写额外的代码，来仿造这些特性的工作方式。就像是玩乐高积木一样，有时候我们不能用最新的配件，反而需要自己动手拼接出相似的部件来满足需求。四、代码示例接下来，我将以一个简单的示例来演示如何在SpringBoot应用程序中使用数据库迁移工具。假设我们有一个名为User的实体类，我们想要将其保存到数据库中。 java @Entity @Table(name = "users") public class User { @Id @GeneratedValue(strategy = GenerationType.AUTO) private Long id; @Column(nullable = false) private String name; // getters and setters } 然后，我们需要创建一个SpringBoot应用程序，并添加Spring Data JPA和HSQLDB依赖。 xml org.springframework.boot spring-boot-starter-data-jpa org.hsqldb hsqldb runtime 接着，我们需要创建一个application.properties文件，配置数据库连接信息。 properties spring.datasource.url=jdbc:hsqldb:mem:testdb spring.datasource.driverClassName=org.hsqldb.jdbcDriver spring.datasource.username=sa spring.datasource.password= spring.jpa.hibernate.ddl-auto=create 然后，我们需要创建一个UserRepository接口，定义CRUD操作方法。 java public interface UserRepository extends JpaRepository { } 最后，我们可以在控制器中调用UserRepository的方法，将用户保存到数据库中。 java @RestController public class UserController { private final UserRepository userRepository; public UserController(UserRepository userRepository) { this.userRepository = userRepository; } @PostMapping("/users") public ResponseEntity createUser(@RequestBody User user) { userRepository.save(user); return ResponseEntity.ok().build(); } } 以上就是使用SpringBoot进行数据库迁移的基本步骤。这样子做，我们就能轻轻松松地管理、更新咱们的数据库，确保我们的应用程序能够像老黄牛一样稳稳当当地运行起来，一点儿都不带出岔子的。

2023-12-01 22:15:50

夜色朦胧_t

Sqoop

Sqoop 在 Hadoop 生态系统中的关系型数据库数据迁移：并行导入导出与增量加载至 Hive 和 Oracle 实践

...--connect jdbc:mysql://localhost:3306/mydatabase \ --username myuser \ --password mypassword \ --table mytable \ --target-dir /user/hadoop/mytable_imported 这个命令展示了如何从MySQL数据库导入mytable表到HDFS的/user/hadoop/mytable_imported目录下。 2. Sqoop工作原理及功能特性 (此处详细描述Sqoop的工作原理，如并行导入导出、自动生成Java类、分区导入等特性) 2.1 并行导入示例 Sqoop利用MapReduce模型实现并行数据导入，大幅提高数据迁移效率。 shell sqoop import --num-mappers 4 ... 此命令设置4个map任务并行执行数据导入操作。 3. Sqoop的基本使用（这里详细说明Sqoop的各种命令，包括import、export、create-hive-table等，并给出实例） 3.1 Sqoop Import 实例详解 shell 示例：将Oracle表同步至Hive表 sqoop import \ --connect jdbc:oracle:thin:@//hostname:port/service_name \ --username username \ --password password \ --table source_table \ --hive-import \ --hive-table target_table 这段代码演示了如何将Oracle数据库中的source_table直接导入到Hive的target_table。 4. Sqoop高级应用与实践问题探讨（这部分深入探讨Sqoop的一些高级用法，如增量导入、容错机制、自定义连接器等，并通过具体案例阐述） 4.1 增量导入策略 shell 使用lastmodified或incremental方式实现增量导入 sqoop import \ --connect ... \ --table source_table \ --check-column id \ --incremental lastmodified \ --last-value 这段代码展示了如何根据最后一次导入的id值进行增量导入。 5. Sqoop在实际业务场景中的应用与挑战（在这部分，我们可以探讨Sqoop在真实业务环境下的应用场景，以及可能遇到的问题及其解决方案）以上仅为大纲及部分内容展示，实际上每部分都需要进一步拓展、深化和情感化的表述，使读者能更好地理解Sqoop的工作机制，掌握其使用方法，并能在实际工作中灵活运用。为了达到1000字以上的要求，每个章节都需要充实详尽的解释、具体的思考过程、理解难点解析以及更多的代码实例和应用场景介绍。

2023-02-17 18:50:30

130

雪域高原

Beego

Beego框架下数据库操作与HTTP请求性能优化：连接池、SQL优化及缓存、懒加载实践

...几点： 3.1 使用连接池通过创建连接池，我们可以预先分配一定数量的数据库连接，这样在需要时就可以直接从连接池中获取，避免了每次请求都新建连接的过程，从而提高了性能。 go import "github.com/go-sql-driver/mysql" func init() { db, err := sql.Open("mysql", "root:password@/test?charset=utf8") if err != nil { panic(err) } pool := &sql.Pool{MaxOpenConns: 50, MaxIdleConns: 20, DSN: db.DSN} db.Close() db = pool.Get() defer db.Close() } 3.2 合理设置SQL语句合理的SQL语句能够提高查询效率。比如，咱们在查数据库的时候，尽量别动不动就用“SELECT ”，那可就像大扫荡一样全给捞出来，咱应该更有针对性地只挑选真正需要的字段。对于那些复杂的查询操作，咱得多开动脑筋利用索引这个神器，让它发挥出应有的作用，这样查询速度嗖嗖的，效率杠杠的！四、优化HTTP请求处理 HTTP请求处理是Web应用的核心部分，也是性能优化的重点。Beego提供了路由、中间件等功能，可以帮助我们优化HTTP请求处理。 4.1 使用缓存如果某些数据不需要频繁更新，我们可以考虑将其存储在缓存中。这样一来，下回需要用到的时候，咱们就能直接从缓存里把信息拽出来用，就不用再去数据库翻箱倒柜地查询了。这招能大大提升咱们的运行效率！ go import "github.com/go-redis/redis/v7" var client redis.Client func init() { var err error client, err = redis.NewClient(&redis.Options{ Addr: "localhost:6379", Password: "", DB: 0, }) if err != nil { panic(err) } } func GetCache(key string) interface{} { val, err := client.Get(key).Result() if err == redis.Nil { return nil } else if err != nil { panic(err) } return val } func SetCache(key string, value interface{}) { _, err := client.Set(key, value, 0).Result() if err != nil { panic(err) } } 4.2 懒加载对于一些不常用的数据，我们可以考虑采用懒加载的方式。只有当用户确实有需求，急需这些数据的时候，我们才会去加载，这样一来，既能避免不必要的网络传输，又能嗖嗖地提升整体性能。五、总结通过上述方法，我们可以在一定程度上提高Beego的性能。但是，性能优化这件事儿可不是一蹴而就的，它需要我们在日常开发过程中不断尝试、不断摸索，像探宝一样去积累经验，才能慢慢摸出门道来。同时，咱们也要留个心眼儿，别光顾着追求性能优化，万一过了头，可能还会惹出些别的麻烦来，比如代码变得复杂得像团乱麻，维护起来也更加头疼。所以说呢，咱们得根据实际情况，做出最接地气、最明智的选择。

2024-01-18 18:30:40

537

清风徐来-t

Superset

Superset与Apache Kafka联动：实现实时流数据摄取至可视化图表的集成实践及数据一致性完整性探讨

...配置Superset连接到Kafka数据源。这通常需要咱们用类似“kafka-python”这样的工具箱，从Kafka的主题里边捞出数据来，然后把这些数据塞到Superset能支持的数据仓库里，比如PostgreSQL或者MySQL这些数据库。例如： python from kafka import KafkaConsumer import psycopg2 创建Kafka消费者 consumer = KafkaConsumer('your-topic', bootstrap_servers=['localhost:9092']) 连接数据库 conn = psycopg2.connect(database="your_db", user="your_user", password="your_password", host="localhost") cur = conn.cursor() for message in consumer: 解析并处理Kafka消息 data = process_message(message.value) 将数据写入数据库 cur.execute("INSERT INTO your_table VALUES (%s)", (data,)) conn.commit() (2) Superset数据源配置：在成功将Kafka数据导入到数据库后，需要在Superset中添加对应的数据库连接。打开Superset的管理面板，就像装修房子一样，咱们得设定一个新的SQLAlchemy链接地址，让它指向你的数据库。想象一下，这就是给Superset指路，让它能够顺利找到并探索你刚刚灌入的那些Kafka数据宝藏。 (3) 创建可视化图表：最后，你可以在Superset中创建新的 charts 或仪表板，利用SQL Lab查询刚刚配置好的数据库，从而实现对Kafka实时流数据的可视化展现。 5. 实践思考与探讨将Superset与Apache Kafka集成的过程并非一蹴而就，而是需要根据具体业务场景灵活设计数据流转和处理流程。咱们不光得琢磨怎么把Kafka那家伙产生的实时数据，嗖嗖地塞进关系型数据库里头，同时还得留意，在不破坏数据“新鲜度”的大前提下，确保这些数据的完整性和一致性，可马虎不得啊！另外，在使用Superset的时候，咱们可得好好利用它那牛哄哄的数据透视和过滤功能，这样一来，甭管业务分析需求怎么变，都能妥妥地满足它们。总结来说，Superset与Apache Kafka的结合，如同给实时数据流插上了一双翅膀，让数据的价值得以迅速转化为洞见，驱动企业快速决策。在这个过程中，我们将不断探索和优化，以期在实践中发掘更多可能。

2023-10-19 21:29:53

301

青山绿水

Tomcat

Tomcat配置详解：Servlet映射与过滤器初始化参数

...如说，你可以把数据库连接字符串和API密钥这些敏感信息放到初始化参数里。这样一来，不仅管理起来更方便，还能提高安全性，简直是一举两得！示例如下： xml dbUrl jdbc:mysql://localhost:3306/mydb 在这个例子中，我们定义了一个名为dbUrl的上下文参数，其值为MySQL数据库的连接字符串。在Servlet或过滤器中可以通过getServletContext().getInitParameter("dbUrl")来获取该值。三、总结让Tomcat更懂你的需求好了，朋友们，今天我们一起探索了web.xml文件的重要性及其在Tomcat中的作用。通过调整Servlet映射、设置过滤器和初始化参数，我们可以让Tomcat更懂我们的应用逻辑，更好地帮我们跑起来。记住，就像盖房子一样，提前做好规划和设计能让结果既高效又好看！希望这篇文章能帮助你在构建Web应用的过程中更加得心应手！ --- 希望这篇技术文章能够让你感受到编写Web应用的乐趣，并且对你理解Tomcat及web.xml文件有所帮助。如果有任何问题或想要进一步探讨的内容，请随时留言交流！

2024-11-23 16:20:14

山涧溪流

Sqoop

Sqoop与Apache Atlas联动实现元数据管理：数据迁移、Sqoop Hook与数据全生命周期实践

...从关系型数据库（例如MySQL）中将数据迁移到HDFS： bash sqoop import \ --connect jdbc:mysql://localhost:3306/mydatabase \ --username myuser --password mypassword \ --table mytable \ --target-dir /user/hadoop/sqoop_imports/mytable \ --as-parquetfile 上述代码片段展示了Sqoop的基本用法，通过指定连接参数、认证信息、表名以及目标目录，实现从MySQL到HDFS的数据迁移，并以Parquet格式存储。 3. Apache Atlas元数据管理简介 Apache Atlas利用实体-属性-值模型来描述数据资产，可以自动捕获并记录来自各种数据源（包括Sqoop导入导出作业）的元数据。比方说，当Sqoop这家伙在吭哧吭哧执行导入数据的任务时，Atlas就像个超级侦探，不仅能快速抓取到表结构、字段这些重要信息，还能顺藤摸瓜追踪到数据的“亲缘关系”和它可能产生的影响分析，真可谓火眼金睛啊。 4. Sqoop与Apache Atlas的联动实践联动原理： Sqoop与Atlas的联动主要基于Sqoop hooks机制。用大白话说，Sqoop hook就像是一个神奇的工具，它让我们在搬运数据的过程中，能够按照自己的心意插播一些特别的操作。具体怎么玩呢？就是我们可以通过实现一些特定的接口功能，让Sqoop在忙活着导入或者导出数据的时候，顺手给Atlas发送一条“嘿，我这儿数据有变动，元数据记得更新一下”的消息通知。联动配置与示例：为了实现Sqoop与Atlas的联动，我们需要配置并启用Atlas Sqoop Hook。以下是一个基本的配置示例： xml sqoop.job.data.publish.class org.apache.atlas.sqoop.hook.SqoopHook 这段配置告知Sqoop使用Atlas提供的hook类来处理元数据发布。当Sqoop作业运行时，SqoopHook会自动收集作业相关的元数据，并将其同步至Apache Atlas。 5. 结合实战场景探讨Sqoop与Atlas联动的价值有了Sqoop与Atlas的联动能力，我们的数据工程师不仅能快速便捷地完成数据迁移，还能确保每一步操作都伴随着完整的元数据记录。比如，当业务人员查询某数据集来源时，可通过Atlas直接追溯到原始的Sqoop作业；或者在数据质量检查、合规审计时，可以清晰查看到数据血缘链路，从而更好地理解数据的生命历程，提高决策效率。 6. 总结 Sqoop与Apache Atlas的深度集成，犹如为大数据环境中的数据流动加上了一双明亮的眼睛和智能的大脑。它们不仅简化了数据迁移过程，更强化了对数据全生命周期的管理与洞察力。随着企业越来越重视并不断深挖数据背后的宝藏，这种联动解决方案将会在打造一个既高效、又安全、完全合规的数据管理体系中，扮演着越来越关键的角色。就像是给企业的数据治理装上了一个超级引擎，让一切都运作得更顺畅、更稳妥、更符合规矩。

2023-06-02 20:02:21

119

月下独酌

转载文章

[转载]FMS3 客户端call服务器端

...将一个flash文件连接到一个服务器端的脚本，别且如何从服务器获取数据。在这个例子里面，flash用户界面有一个Button组件（其实例名称是bt）和一个lebel组件(其实例名称是txt)。当一个用户点击Button，客户端连接到服务器；然后客户端运行服务器端的函数来返回一个字符串的值。当服务器端回应了，客户端的回应函数在label上显示字符传。客户端通过改变Button的label来断开连接。当diaconnect的按钮被点击，客户端断开连接，并且清空label。 ONE.创建用户界面 1.开启Flash CS3，然后选择新建>flash文件（ActionScript 3.0）。 2.选择窗口>组件，然后选择User Interface>Button。在属性栏里面为按钮取名bt。 3.添加一个Label组件，移动它到按钮上面，取名为txt。保存文件为test.fla。 TWO.建立as文件。输入以下代码： package { import flash.display.MovieClip; import flash.events.MouseEvent; import flash.events.NetStatusEvent; import flash.net.NetConnection; import flash.net.Responder; public class Main extends MovieClip { public var nc:NetConnection; public var myRespond:Responder; public function Main():void { txt.text=""; bt.label="请点击链接"; myRespond=new Responder(success,failed); bt.addEventListener(MouseEvent.CLICK,clickHandler); } private function clickHandler(e:MouseEvent) { if (bt.label=="请点击链接") { bt.label="请点击断开"; nc=new NetConnection(); nc.connect("rtmp://localhost/viniFMS"); nc.addEventListener(NetStatusEvent.NET_STATUS,statusHandler); nc.call("sayServermsg",myRespond,"Hi"); } else { txt.text=""; bt.label="请点击链接"; nc.close(); } } private function statusHandler(e:NetStatusEvent) { if (e.info.code=="NetConnection.Connect.Success") { trace("ok"); } } private function success(result:Object) { trace("成功："+result.toString()); txt.text=result.toString(); } private function failed(result:Object) { trace("失败："+result.toString()); } } } 将as文件保存为Main.as 在test.fla的属性那的文档类输入Main。保存。 Three:建立通讯文件（.asc） 1.选择文件>新建>actionscript通信文件。输入以下代码： application.onConnect=function(client){ application.acceptConnection(client); client.sayServermsg=function(msg){ return msg+",欢迎你来到FMS的世界！"; } } 将文件保存到fms的application的文件夹下的viniFMS文件夹下，文件名为：main.asc. 确保FMS的服务已经打开，80端口没有被php等占用。然后运行flash，点击按钮。就会有结果出现了。如下图所示。再点击按钮。关闭连接。再点就是打开。如此循环。客户端会得到服务器端返回的数据。一个客户端用actionscript编码来连接到服务器，处理事件，和做其它工作。通过flash CS3你可以使用actionscript 3.0,2.0或1.0，但是actionscript3.0提供更多特性。要想使用flex，你必须使用actionscript 3.0. Actionscript3.0显著的不同于actionscript 2.0。这个向导假设你是在正在编写actionscript 3.0的类，这些类是一些外部的.as文件，有符合你的开发环境的目录结构的包的名称转载于:https://blog.51cto.com/vini123/681426 本篇文章为转载内容。原文链接：https://blog.csdn.net/weixin_33895475/article/details/91647859。该文由互联网用户投稿提供，文中观点代表作者本人意见，并不代表本站的立场。作为信息平台，本站仅提供文章转载服务，并不拥有其所有权，也不对文章内容的真实性、准确性和合法性承担责任。如发现本文存在侵权、违法、违规或事实不符的情况，请及时联系我们，我们将第一时间进行核实并删除相应内容。

2023-09-10 18:10:29

转载

SeaTunnel

数据库事务提交失败：数据同步中网络连接与资源管理问题分析

...原因引起： - 网络连接问题：数据传输过程中出现网络中断。 - 资源不足：数据库服务器资源不足，如内存、磁盘空间等。 - 锁争用：并发操作导致锁定冲突。 - SQL语句错误：提交的SQL语句存在语法错误或逻辑错误。 3.2 如何解决？既然已经找到了潜在的原因，那么接下来就是解决问题的关键环节了。我们可以从以下几个方面入手： - 检查网络连接：确保数据源与目标数据库之间的网络连接稳定可靠。 - 优化资源管理：增加数据库服务器的资源配额，确保有足够的内存和磁盘空间。 - 避免锁争用：合理安排并发操作，减少锁争用的可能性。 - 验证SQL语句：仔细检查提交的SQL语句，确保其正确无误。 4. 实战演练为了更好地理解这些问题，我们可以通过一些实际的例子来进行演练。下面我会给出几个具体的代码示例，帮助大家更好地理解和解决问题。 4.1 示例一：处理网络连接问题 java // 这是一个简单的配置文件示例，用于指定数据源和目标数据库 { "source": { "type": "jdbc", "config": { "url": "jdbc:mysql://source_host:port/source_db", "username": "source_user", "password": "source_password" } }, "sink": { "type": "jdbc", "config": { "url": "jdbc:mysql://target_host:port/target_db", "username": "target_user", "password": "target_password" } } } 4.2 示例二：优化资源管理 java // 通过调整配置文件中的参数，增加数据库连接池的大小 { "source": { "type": "jdbc", "config": { "url": "jdbc:mysql://source_host:port/source_db", "username": "source_user", "password": "source_password", "connectionPoolSize": 50 // 增加连接池大小 } }, "sink": { "type": "jdbc", "config": { "url": "jdbc:mysql://target_host:port/target_db", "username": "target_user", "password": "target_password", "connectionPoolSize": 50 // 增加连接池大小 } } } 4.3 示例三：避免锁争用 java // 在配置文件中添加适当的并发控制策略 { "source": { "type": "jdbc", "config": { "url": "jdbc:mysql://source_host:port/source_db", "username": "source_user", "password": "source_password" } }, "sink": { "type": "jdbc", "config": { "url": "jdbc:mysql://target_host:port/target_db", "username": "target_user", "password": "target_password", "concurrency": 10 // 设置并发度 } } } 4.4 示例四：验证SQL语句 java // 在配置文件中明确指定要执行的SQL语句 { "source": { "type": "sql", "config": { "sql": "SELECT FROM source_table" } }, "sink": { "type": "jdbc", "config": { "url": "jdbc:mysql://target_host:port/target_db", "username": "target_user", "password": "target_password", "table": "target_table", "sql": "INSERT INTO target_table (column1, column2) VALUES (?, ?)" } } } 5. 总结与展望在这次探索中，我们不仅学习了如何处理数据库事务提交失败的问题，还了解了如何通过实际操作来解决这些问题。虽然在这个过程中遇到了不少挑战，但正是这些挑战让我们成长。未来，我们将继续探索更多关于数据集成和处理的知识，让我们的旅程更加丰富多彩。希望这篇技术文章能够帮助你在面对类似问题时有更多的信心和方法。如果你有任何疑问或建议，欢迎随时与我交流。让我们一起加油，不断进步！

2025-02-04 16:25:24

111

半夏微凉

Datax

Datax数据同步中的安全性实践：传输加密、认证授权与敏感信息保护机制详解

... "name": "mysqlreader", "parameter": { "username": "root", "password": "", "connection": [ { "jdbcUrl": ["jdbc:mysql://source-db:3306/mydb?useSSL=true&serverTimezone=UTC"], "table": ["table1"] } ], // 配置SSL以保证数据传输安全 "connectionProperties": "useSSL=true" } }, "writer": {...} } ], "setting": { // ... } } } 上述示例中，我们在配置MySQL读取器时启用了SSL连接，这是Datax保障数据传输安全的第一道防线。 2. 认证与授权 Datax服务端及各数据源间的认证与授权也是保障安全的重要一环。Datax本身并不内置用户权限管理功能，而是依赖于各个数据源自身的安全机制。例如，我们可以通过配置数据库的用户名和密码实现访问控制： json "reader": { "name": "mysqlreader", "parameter": { "username": "datax_user", // 数据库用户 "password": "", // 密码 // ... } } 在此基础上，企业内部可以结合Kerberos或LDAP等统一身份验证服务进一步提升Datax作业的安全性。 3. 敏感信息处理 Datax配置文件中通常会包含数据库连接信息、账号密码等敏感内容。为防止敏感信息泄露，Datax支持参数化配置，通过环境变量或者外部化配置文件的方式避免直接在任务配置中硬编码敏感信息： json "reader": { "name": "mysqlreader", "parameter": { "username": "${db_user}", "password": "${}", // ... } } 然后在执行Datax任务时，通过命令行传入环境变量： bash export db_user='datax_user' && export db_password='' && datax.py /path/to/job.json 这种方式既满足了安全性要求，也便于运维人员管理和分发任务配置。 4. 审计与日志记录 Datax提供详细的运行日志功能，包括任务启动时间、结束时间、状态以及可能发生的错误信息，这对于后期审计与排查问题具有重要意义。同时呢，我们可以通过企业内部那个专门用来收集和分析日志的平台，实时盯着Datax作业的执行动态，一旦发现有啥不对劲的地方，就能立马出手解决，保证整个流程顺顺利利的。综上所述，Datax的安全性设计涵盖了数据传输安全、认证授权机制、敏感信息处理以及操作审计等多个层面。在用Datax干活的时候，咱们得把这些安全策略整得明明白白、运用自如。只有这样，才能一边麻溜儿地完成数据同步任务，一边稳稳当当地把咱的数据资产保护得严严实实，一点儿风险都不冒。这就像是现实生活里的锁匠师傅，不仅要手到擒来地掌握开锁这门绝活儿，更得深谙打造铜墙铁壁般安全体系的门道，确保我们的“数据宝藏”牢不可破，固若金汤。

2024-01-11 18:45:57

1143

蝶舞花间

Netty

Netty服务器应对网络中断：ChannelFuture、FutureListener及心跳检测与重连机制的实践应用

...议应用程序的异步事件驱动模型。Netty这个家伙，它可是搭建在NIO（非阻塞式输入输出）这个强大基石上的，这样一来，它能够在单个线程里边同时应对多个连接请求，大大提升了程序处理并发任务的能力，让效率噌噌噌地往上涨。三、Netty服务器的网络中断问题当网络发生中断时，Netty服务器通常会产生两种异常： 1. ChannelException: 由于底层I/O操作失败而抛出的异常。 2. UnresolvedAddressException: 当尝试打开一个到不存在的地址的连接时抛出的异常。这两种异常都会导致服务器无法正常接收和发送数据。四、处理Netty服务器的网络中断问题 1. 使用ChannelFuture和FutureListener 在Netty中，我们可以使用ChannelFuture和FutureListener来处理网络中断问题。ChannelFuture是创建了一个用于等待特定I/O操作完成的Future对象。FutureListener是一个接口，可以监听ChannelFuture的状态变化。例如，我们可以使用以下代码来监听一个ChannelFuture的状态变化： java channelFuture.addListener(new FutureListener() { @Override public void operationComplete(ChannelFuture future) throws Exception { if (future.isSuccess()) { // 连接成功 } else { // 连接失败 } } }); 2. 使用心跳检测机制除了监听ChannelFuture的状态变化外，我们还可以使用心跳检测机制来检查网络是否中断。实际上，我们可以这样理解：在用户的设备上（也就是客户端），我们设定一个任务，定期给服务器发送个“招呼”——这就是所谓的心跳包。就像朋友之间互相确认对方是否还在一样，如果服务器在一段时间内没有回应这个“招呼”，那我们就推测可能是网络连接断开了，简单来说就是网络出小差了。例如，我们可以使用以下代码来发送心跳包： java // 创建心跳包 ByteBuf heartbeat = Unpooled.buffer(); heartbeat.writeInt(HeartbeatMessage.HEARTBEAT); heartbeat.writerIndex(heartbeat.readableBytes()); // 发送心跳包 channel.writeAndFlush(heartbeat); 3. 使用重连机制当网络中断后，我们需要尽快重新建立连接。为了实现这个功能，我们可以使用重连机制。换句话说，一旦网络突然掉线了，我们立马麻溜地开始尝试建立一个新的连接，并且持续密切关注着新的连接状态有没有啥变化。例如，我们可以使用以下代码来重新建立连接： java // 重试次数 int retryCount = 0; while (retryCount < maxRetryCount) { try { // 创建新的连接 Bootstrap bootstrap = new Bootstrap(); ChannelFuture channelFuture = bootstrap.group(eventLoopGroup).channel(NioServerSocketChannel.class) .option(ChannelOption.SO_BACKLOG, backlog) .childHandler(new ServerInitializer()) .connect(new InetSocketAddress(host, port)).sync(); // 监听新的连接状态变化 channelFuture.addListener(new FutureListener() { @Override public void operationComplete(ChannelFuture future) throws Exception { if (future.isSuccess()) { // 新的连接建立成功 return; } // 新的连接建立失败，继续重试 if (future.cause() instanceof ConnectException || future.cause() instanceof UnknownHostException) { retryCount++; System.out.println("Failed to connect to server, will retry in " + retryDelay + "ms"); Thread.sleep(retryDelay); continue; } } }); // 连接建立成功，返回 return channelFuture.channel(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); } } 五、总结在网络中断问题上，我们可以通过监听ChannelFuture的状态变化、使用心跳检测机制和重连机制来处理。这些方法各有各的好和不足，不过总的来说，甭管怎样，它们都能在关键时刻派上用场，就是在网络突然断开的时候，帮我们快速重新连上线，确保服务器稳稳当当地运行起来，一点儿不影响正常工作。以上就是关于如何处理Netty服务器的网络中断问题的文章，希望能对你有所帮助。

2023-02-27 09:57:28

137

梦幻星空-t

转载文章

[转载]mysql profile 导出_MySQL数据的导出和导入工具:mysqldump_MySQL

mysqldump , mysqldump是MySQL数据库系统自带的一个用于备份数据库的实用工具，它能够生成SQL语句来表示选定数据库或表的结构以及数据。通过命令行界面调用此工具时，用户可以指定一系列选项来自定义导出行为，如是否包含表创建语句、锁定表以保证一致性、添加删除表的语句、压缩输出等。在本文中，mysqldump被详细介绍为一种进行数据库迁移、备份和恢复的关键手段。 INSERT DELAYED , INSERT DELAYED 是MySQL数据库中的一个插入选项，当与mysqldump结合使用时（通过--delayed选项），它可以将INSERT语句放入队列而不是立即执行，尤其适用于高并发写入场景。这种机制使得MySQL服务器在处理其他查询的同时逐渐处理这些延迟插入的行，从而提高整体性能。然而，需要注意的是，INSERT DELAYED不适用于InnoDB存储引擎。 TCP/IP端口指定连接 , 在MySQL数据库环境中，TCP/IP端口指定连接是指在使用mysqldump或其他客户端工具连接到MySQL服务器时，可以通过-P 或 --port 选项指定服务器监听的特定TCP/IP端口号。默认情况下，MySQL服务器通常在本地主机上监听3306端口，但在某些情况下，可能需要根据实际配置更改端口号以便正确建立连接。 LOAD DATA INFILE , LOAD DATA INFILE是MySQL提供的一种高效的数据导入方式，允许从文本文件快速地将大量数据加载到表中。在文章中提到的mysqldump的几个选项（如--fields-terminated-by, --fields-enclosed-by等）就是用来配合LOAD DATA INFILE语句，在导出数据时确保其格式与LOAD DATA INFILE所需的格式相匹配，便于后续快速导入数据。尽管在文中没有直接演示如何使用LOAD DATA INFILE，但这些选项的存在意味着导出的数据可以方便地用于该命令的导入操作。 MySQL客户端管道操作 , MySQL客户端管道操作是一种利用操作系统提供的管道功能，将mysqldump导出的SQL语句流式传输至另一个MySQL客户端（如mysql命令行工具），进而实现将数据从一个数据库导入到另一个数据库的过程。在本文中，展示了如何通过管道操作将mysqldump导出的SQL语句直接导入到远程MySQL服务器上的目标数据库中，这样既能减少磁盘I/O开销，又能提高数据迁移效率。例如，mysqldump --opt database | mysql --host=remote-host -C database就是一条典型的利用管道将数据从本地数据库迁移到远程数据库的命令。

2023-02-01 23:51:06

265

转载

Saiku

Saiku系统恢复：备份与故障转移不足需改进

...个脚本的作用就是调用MySQL的dump命令，生成数据库的备份文件。这样就不用担心忘记备份了，挺方便的。 bash 编辑crontab crontab -e 添加如下行，每周日凌晨两点执行一次备份 0 2 0 /usr/bin/mysqldump -u username -p'password' database_name > /path/to/backup/db_backup_$(date +\%Y\%m\%d).sql 4. 恢复策略的设计现在我们已经了解了为什么需要一个好的恢复计划，接下来谈谈如何设计这样一个计划。首先，你需要明确哪些数据是最关键的。然后，根据这些数据的重要程度制定相应的恢复策略。比如说，如果你每天都在更新的数据，那就得时不时地备份一下，甚至可以每一小时就来一次。但如果是那种好几天都不动弹的数据，那就可以放宽心，不用那么频繁地备份了。另外，别忘了测试你的恢复计划！只有经过实践检验的恢复流程才能真正发挥作用。你可以定期模拟一些常见故障场景，看看你的系统是否能够顺利恢复到正常状态。 5. 代码示例为了让大家更好地理解，下面我会给出几个具体的代码示例，展示如何使用Saiku API来进行数据恢复操作。示例1：连接到Saiku服务器 java import org.saiku.service.datasource.IDatasourceService; import org.saiku.service.datasource.MondrianDatasource; public class SaikuConnectionExample { public static void main(String[] args) { // 假设我们已经有了一个名为"myDataSource"的数据源实例 MondrianDatasource myDataSource = new MondrianDatasource(); myDataSource.setName("myDataSource"); // 使用datasource服务保存数据源配置 IDatasourceService datasourceService = ...; // 获取datasource服务实例 datasourceService.save(myDataSource); } } 示例2：从备份文件中恢复数据这里假设你已经有一个包含所有必要信息的备份文件，比如SQL脚本。 java import java.io.BufferedReader; import java.io.FileReader; import java.sql.Connection; import java.sql.DriverManager; import java.sql.Statement; public class RestoreFromBackupExample { public static void main(String[] args) { try (Connection conn = DriverManager.getConnection("jdbc:mysql://localhost:3306/mydb", "username", "password")) { Statement stmt = conn.createStatement(); // 读取备份文件内容并执行 BufferedReader reader = new BufferedReader(new FileReader("/path/to/backup/file.sql")); String line; StringBuilder sql = new StringBuilder(); while ((line = reader.readLine()) != null) { sql.append(line); if (line.trim().endsWith(";")) { stmt.execute(sql.toString()); sql.setLength(0); // 清空StringBuilder } } reader.close(); } catch (Exception e) { e.printStackTrace(); } } } 6. 结语好了，到这里我们的讨论就告一段落了。希望今天聊的这些能让大家更看重系统恢复计划，也赶紧动手做点啥来提高自己的数据安全，毕竟防患于未然嘛。记住，预防总是胜于治疗，提前做好准备总比事后补救要好得多！最后，如果你有任何想法或建议，欢迎随时与我交流。数据分析的世界充满了无限可能，让我们一起探索吧！ --- 以上就是本次关于“Saiku的系统恢复计划不充分”的全部内容。希望这篇文章能够对你有所帮助，也欢迎大家提出宝贵的意见和建议。

2024-11-18 15:31:47

寂静森林

Sqoop

Sqoop在数据迁移中因透明性不足导致作业失败的案例分析

...--connect jdbc:mysql://localhost:3306/mydatabase \ --username root \ --password mypassword \ --table employees \ --target-dir /user/hadoop/employees 这段代码看起来挺正常的，但我后来发现，当表中的数据量过大或者存在一些复杂的约束条件时，Sqoop就表现得不太友好。 --- 二、Sqoop作业失败的背后接下来，让我们一起深入探讨一下这个问题。说实话，刚开始接触Sqoop那会儿，我对它是怎么工作的压根儿没弄明白，稀里糊涂的。我以为只要配置好连接信息，然后指定源表和目标路径就行了。但实际上，Sqoop并不是这么简单的工具。当我第一次遇到作业失败的情况时，内心是崩溃的。屏幕上显示的错误信息密密麻麻，但仔细一看，其实都是些常见的问题。打个比方啊，Sqoop这家伙一碰到一些特别的符号，比如空格或者换行符，就容易“翻车”，直接给你整出点问题来。还有呢，有时候因为网络卡了一下，延迟太高，Sqoop就跟服务器说拜拜了，连接就这么断了，挺烦人的。有一次，我在尝试将一张包含大量JSON字段的表导出到HDFS时，Sqoop直接报错了。我当时就在心里嘀咕：“为啥别的工具处理起来轻轻松松的事儿，到Sqoop这儿就变得这么棘手呢？”后来，我一咬牙，开始翻遍各种资料，想着一定要找出个解决办法来。思考与尝试：经过一番研究，我发现Sqoop默认情况下并不会对数据进行深度解析，这意味着如果数据本身存在问题，Sqoop可能无法正确处理。所以，为了验证这个假设，我又做了一次测试。 bash sqoop import \ --connect jdbc:mysql://localhost:3306/mydatabase \ --username root \ --password mypassword \ --table problematic_table \ --fields-terminated-by '\t' \ --lines-terminated-by '\n' 这次我特意指定了分隔符和换行符，希望能避免之前遇到的那些麻烦。嘿，没想到这次作业居然被我搞定了！中间经历了不少波折，不过好在最后算是弄懂了个中奥秘，也算没白费功夫。 --- 三、透明性的重要性 Sqoop到底懂不懂我的需求？说到Sqoop的透明性，我觉得这是一个非常重要的概念。所谓的透明性嘛，简单来说，就是Sqoop能不能明白咱们的心思，然后老老实实地按咱们想的去干活儿，不添乱、不出错！显然，在我遇到的这些问题中，Sqoop的表现并不能让人满意。举个例子来说，假设你有一个包含多列的大表，其中某些列的数据类型比较复杂（例如数组、嵌套对象等）。在这种情况下，Sqoop可能会因为无法正确识别这些数据类型而失败。更糟糕的是，它并不会给出明确的提示，而是默默地报错，让你一头雾水。为了更好地应对这种情况，我在后续的工作中加入了更多的调试步骤。比如说啊，你可以先用describe这个命令去看看表的结构，确保所有的字段都乖乖地被正确识别了；接着呢，再用--check-column这个选项去瞅一眼，看看有没有重复的记录藏在里面。这样一来，虽然增加了工作量，但至少能减少不必要的麻烦。示例代码： bash sqoop job --create my_job \ -- import \ --connect jdbc:mysql://localhost:3306/mydatabase \ --username root \ --password mypassword \ --table employees \ --check-column id \ --incremental append \ --last-value 0 这段代码展示了如何创建一个增量作业，用于定期更新目标目录中的数据。通过这种方式，可以有效避免一次性加载过多数据带来的性能瓶颈。 --- 四、总结与展望与Sqoop共舞总的来说，尽管Sqoop在某些场景下表现得不尽人意，但它依然是一个强大的工具。通过不断学习和实践，我相信自己能够更加熟练地驾驭它。未来的计划里，我特别想试试一些更酷的功能，比如说用Sqoop直接搞出Avro文件，或者把Spark整进来做分布式计算，感觉会超级带劲！最后，我想说的是，技术这条路从来都不是一帆风顺的。遇到困难并不可怕，可怕的是我们因此放弃努力。正如那句话所说：“失败乃成功之母。”只要保持好奇心和求知欲，总有一天我们会找到属于自己的答案。如果你也有类似的经历，欢迎随时交流！我们一起进步，一起成长！ --- 希望这篇文章对你有所帮助，如果有任何疑问或者想要了解更多细节，请随时告诉我哦！

2025-03-22 15:39:31

风中飘零

转载文章

[转载]大数据IMF传奇行动绝密课程第104-114课：Spark Streaming电商广告点击综合案例

...放入到持久化系统中（MySQL）广告点击系统实时分析的意义：因为可以在线实时的看见广告的投放效果，就为广告的更大规模的投入和调整打下了坚实的基础，从而为公司带来最大化的经济回报。核心需求： 1、实时黑名单动态过滤出有效的用户广告点击行为：因为黑名单用户可能随时出现，所以需要动态更新； 2、在线计算广告点击流量； 3、Top3热门广告； 4、每个广告流量趋势； 5、广告点击用户的区域分布分析 6、最近一分钟的广告点击量； 7、整个广告点击Spark Streaming处理程序724小时运行；数据格式：时间、用户、广告、城市等技术细节：在线计算用户点击的次数分析，屏蔽IP等；使用updateStateByKey或者mapWithState进行不同地区广告点击排名的计算； Spark Streaming+Spark SQL+Spark Core等综合分析数据；使用Window类型的操作；高可用和性能调优等等；流量趋势，一般会结合DB等； Spark Core / /package com.tom.spark.SparkApps.sparkstreaming;import java.util.Date;import java.util.HashMap;import java.util.Map;import java.util.Properties;import java.util.Random;import kafka.javaapi.producer.Producer;import kafka.producer.KeyedMessage;import kafka.producer.ProducerConfig;/ 数据生成代码，Kafka Producer产生数据/public class MockAdClickedStat {/ @param args/public static void main(String[] args) {final Random random = new Random();final String[] provinces = new String[]{"Guangdong", "Zhejiang", "Jiangsu", "Fujian"};final Map<String, String[]> cities = new HashMap<String, String[]>();cities.put("Guangdong", new String[]{"Guangzhou", "Shenzhen", "Dongguan"});cities.put("Zhejiang", new String[]{"Hangzhou", "Wenzhou", "Ningbo"});cities.put("Jiangsu", new String[]{"Nanjing", "Suzhou", "Wuxi"});cities.put("Fujian", new String[]{"Fuzhou", "Xiamen", "Sanming"});final String[] ips = new String[] {"192.168.112.240","192.168.112.239","192.168.112.245","192.168.112.246","192.168.112.247","192.168.112.248","192.168.112.249","192.168.112.250","192.168.112.251","192.168.112.252","192.168.112.253","192.168.112.254",};/ Kafka相关的基本配置信息/Properties kafkaConf = new Properties();kafkaConf.put("serializer.class", "kafka.serializer.StringEncoder");kafkaConf.put("metadeta.broker.list", "Master:9092,Worker1:9092,Worker2:9092");ProducerConfig producerConfig = new ProducerConfig(kafkaConf);final Producer<Integer, String> producer = new Producer<Integer, String>(producerConfig);new Thread(new Runnable() {public void run() {while(true) {//在线处理广告点击流的基本数据格式：timestamp、ip、userID、adID、province、cityLong timestamp = new Date().getTime();String ip = ips[random.nextInt(12)]; //可以采用网络上免费提供的ip库int userID = random.nextInt(10000);int adID = random.nextInt(100);String province = provinces[random.nextInt(4)];String city = cities.get(province)[random.nextInt(3)];String clickedAd = timestamp + "\t" + ip + "\t" + userID + "\t" + adID + "\t" + province + "\t" + city;producer.send(new KeyedMessage<Integer, String>("AdClicked", clickedAd));try {Thread.sleep(50);} catch (InterruptedException e) {// TODO Auto-generated catch blocke.printStackTrace();} }} }).start();} } package com.tom.spark.SparkApps.sparkstreaming;import java.sql.Connection;import java.sql.DriverManager;import java.sql.PreparedStatement;import java.sql.ResultSet;import java.sql.SQLException;import java.util.ArrayList;import java.util.Arrays;import java.util.HashMap;import java.util.HashSet;import java.util.Iterator;import java.util.List;import java.util.Map;import java.util.Set;import java.util.concurrent.LinkedBlockingQueue;import kafka.serializer.StringDecoder;import org.apache.spark.SparkConf;import org.apache.spark.api.java.JavaPairRDD;import org.apache.spark.api.java.JavaRDD;import org.apache.spark.api.java.JavaSparkContext;import org.apache.spark.api.java.function.Function;import org.apache.spark.api.java.function.Function2;import org.apache.spark.api.java.function.PairFunction;import org.apache.spark.api.java.function.VoidFunction;import org.apache.spark.sql.DataFrame;import org.apache.spark.sql.Row;import org.apache.spark.sql.RowFactory;import org.apache.spark.sql.hive.HiveContext;import org.apache.spark.sql.types.DataTypes;import org.apache.spark.sql.types.StructType;import org.apache.spark.streaming.Durations;import org.apache.spark.streaming.api.java.JavaDStream;import org.apache.spark.streaming.api.java.JavaPairDStream;import org.apache.spark.streaming.api.java.JavaPairInputDStream;import org.apache.spark.streaming.api.java.JavaStreamingContext;import org.apache.spark.streaming.api.java.JavaStreamingContextFactory;import org.apache.spark.streaming.kafka.KafkaUtils;import com.google.common.base.Optional;import scala.Tuple2;/ 数据处理，Kafka消费者/public class AdClickedStreamingStats {/ @param args/public static void main(String[] args) {// TODO Auto-generated method stub//好处：1、checkpoint 2、工厂final SparkConf conf = new SparkConf().setAppName("SparkStreamingOnKafkaDirect").setMaster("hdfs://Master:7077/");final String checkpointDirectory = "hdfs://Master:9000/library/SparkStreaming/CheckPoint_Data";JavaStreamingContextFactory factory = new JavaStreamingContextFactory() {public JavaStreamingContext create() {// TODO Auto-generated method stubreturn createContext(checkpointDirectory, conf);} };/ 可以从失败中恢复Driver，不过还需要指定Driver这个进程运行在Cluster，并且在提交应用程序的时候制定--supervise;/JavaStreamingContext javassc = JavaStreamingContext.getOrCreate(checkpointDirectory, factory);/ 第三步：创建Spark Streaming输入数据来源input Stream: 1、数据输入来源可以基于File、HDFS、Flume、Kafka、Socket等 2、在这里我们指定数据来源于网络Socket端口，Spark Streaming连接上该端口并在运行的时候一直监听该端口的数据 (当然该端口服务首先必须存在），并且在后续会根据业务需要不断有数据产生（当然对于Spark Streaming 应用程序的运行而言，有无数据其处理流程都是一样的） 3、如果经常在每间隔5秒钟没有数据的话不断启动空的Job其实会造成调度资源的浪费，因为并没有数据需要发生计算；所以实际的企业级生成环境的代码在具体提交Job前会判断是否有数据，如果没有的话就不再提交Job；///创建Kafka元数据来让Spark Streaming这个Kafka Consumer利用Map<String, String> kafkaParameters = new HashMap<String, String>();kafkaParameters.put("metadata.broker.list", "Master:9092,Worker1:9092,Worker2:9092");Set<String> topics = new HashSet<String>();topics.add("SparkStreamingDirected");JavaPairInputDStream<String, String> adClickedStreaming = KafkaUtils.createDirectStream(javassc, String.class, String.class, StringDecoder.class, StringDecoder.class,kafkaParameters, topics);/因为要对黑名单进行过滤，而数据是在RDD中的，所以必然使用transform这个函数；但是在这里我们必须使用transformToPair，原因是读取进来的Kafka的数据是Pair<String,String>类型, 另一个原因是过滤后的数据要进行进一步处理，所以必须是读进的Kafka数据的原始类型在此再次说明，每个Batch Duration中实际上讲输入的数据就是被一个且仅被一个RDD封装的，你可以有多个 InputDStream，但其实在产生job的时候，这些不同的InputDStream在Batch Duration中就相当于Spark基于HDFS 数据操作的不同文件来源而已罢了。/JavaPairDStream<String, String> filteredadClickedStreaming = adClickedStreaming.transformToPair(new Function<JavaPairRDD<String,String>, JavaPairRDD<String,String>>() {public JavaPairRDD<String, String> call(JavaPairRDD<String, String> rdd) throws Exception {/ 在线黑名单过滤思路步骤： 1、从数据库中获取黑名单转换成RDD，即新的RDD实例封装黑名单数据； 2、然后把代表黑名单的RDD的实例和Batch Duration产生的RDD进行Join操作，准确的说是进行leftOuterJoin操作，也就是说使用Batch Duration产生的RDD和代表黑名单的RDD实例进行 leftOuterJoin操作，如果两者都有内容的话，就会是true，否则的话就是false 我们要留下的是leftOuterJoin结果为false； /final List<String> blackListNames = new ArrayList<String>();JDBCWrapper jdbcWrapper = JDBCWrapper.getJDBCInstance();jdbcWrapper.doQuery("SELECT FROM blacklisttable", null, new ExecuteCallBack() {public void resultCallBack(ResultSet result) throws Exception {while(result.next()){blackListNames.add(result.getString(1));} }});List<Tuple2<String, Boolean>> blackListTuple = new ArrayList<Tuple2<String,Boolean>>();for(String name : blackListNames) {blackListTuple.add(new Tuple2<String, Boolean>(name, true));}List<Tuple2<String, Boolean>> blacklistFromListDB = blackListTuple; //数据来自于查询的黑名单表并且映射成为<String, Boolean>JavaSparkContext jsc = new JavaSparkContext(rdd.context());/ 黑名单的表中只有userID，但是如果要进行join操作的话就必须是Key-Value，所以在这里我们需要基于数据表中的数据产生Key-Value类型的数据集合/JavaPairRDD<String, Boolean> blackListRDD = jsc.parallelizePairs(blacklistFromListDB);/ 进行操作的时候肯定是基于userID进行join，所以必须把传入的rdd进行mapToPair操作转化成为符合格式的RDD/JavaPairRDD<String, Tuple2<String, String>> rdd2Pair = rdd.mapToPair(new PairFunction<Tuple2<String,String>, String, Tuple2<String, String>>() {public Tuple2<String, Tuple2<String, String>> call(Tuple2<String, String> t) throws Exception {// TODO Auto-generated method stubString userID = t._2.split("\t")[2];return new Tuple2<String, Tuple2<String,String>>(userID, t);} });JavaPairRDD<String, Tuple2<Tuple2<String, String>, Optional<Boolean>>> joined = rdd2Pair.leftOuterJoin(blackListRDD);JavaPairRDD<String, String> result = joined.filter(new Function<Tuple2<String,Tuple2<Tuple2<String,String>,Optional<Boolean>>>, Boolean>() {public Boolean call(Tuple2<String, Tuple2<Tuple2<String, String>, Optional<Boolean>>> tuple)throws Exception {// TODO Auto-generated method stubOptional<Boolean> optional = tuple._2._2;if(optional.isPresent() && optional.get()){return false;} else {return true;} }}).mapToPair(new PairFunction<Tuple2<String,Tuple2<Tuple2<String,String>,Optional<Boolean>>>, String, String>() {public Tuple2<String, String> call(Tuple2<String, Tuple2<Tuple2<String, String>, Optional<Boolean>>> t)throws Exception {// TODO Auto-generated method stubreturn t._2._1;} });return result;} });//广告点击的基本数据格式：timestamp、ip、userID、adID、province、cityJavaPairDStream<String, Long> pairs = filteredadClickedStreaming.mapToPair(new PairFunction<Tuple2<String,String>, String, Long>() {public Tuple2<String, Long> call(Tuple2<String, String> t) throws Exception {String[] splited=t._2.split("\t");String timestamp = splited[0]; //YYYY-MM-DDString ip = splited[1];String userID = splited[2];String adID = splited[3];String province = splited[4];String city = splited[5]; String clickedRecord = timestamp + "_" +ip + "_"+userID+"_"+adID+"_"+province +"_"+city;return new Tuple2<String, Long>(clickedRecord, 1L);} });/ 第4.3步：在单词实例计数为1基础上，统计每个单词在文件中出现的总次数/JavaPairDStream<String, Long> adClickedUsers= pairs.reduceByKey(new Function2<Long, Long, Long>() {public Long call(Long i1, Long i2) throws Exception{return i1 + i2;} });/判断有效的点击，复杂化的采用机器学习训练模型进行在线过滤简单的根据ip判断1天不超过100次；也可以通过一个batch duration的点击次数判断是否非法广告点击，通过一个batch来判断是不完整的，还需要一天的数据也可以每一个小时来判断。/JavaPairDStream<String, Long> filterClickedBatch = adClickedUsers.filter(new Function<Tuple2<String,Long>, Boolean>() {public Boolean call(Tuple2<String, Long> v1) throws Exception {if (1 < v1._2){//更新一些黑名单的数据库表return false;} else { return true;} }});//filterClickedBatch.print();//写入数据库filterClickedBatch.foreachRDD(new Function<JavaPairRDD<String,Long>, Void>() {public Void call(JavaPairRDD<String, Long> rdd) throws Exception {rdd.foreachPartition(new VoidFunction<Iterator<Tuple2<String,Long>>>() {public void call(Iterator<Tuple2<String, Long>> partition) throws Exception {//使用数据库连接池的高效读写数据库的方式将数据写入数据库mysql//例如一次插入 1000条 records，使用insertBatch 或 updateBatch//插入的用户数据信息：userID,adID,clickedCount,time//这里面有一个问题，可能出现两条记录的key是一样的，此时需要更新累加操作List<UserAdClicked> userAdClickedList = new ArrayList<UserAdClicked>();while(partition.hasNext()) {Tuple2<String, Long> record = partition.next();String[] splited = record._1.split("\t");UserAdClicked userClicked = new UserAdClicked();userClicked.setTimestamp(splited[0]);userClicked.setIp(splited[1]);userClicked.setUserID(splited[2]);userClicked.setAdID(splited[3]);userClicked.setProvince(splited[4]);userClicked.setCity(splited[5]);userAdClickedList.add(userClicked);}final List<UserAdClicked> inserting = new ArrayList<UserAdClicked>();final List<UserAdClicked> updating = new ArrayList<UserAdClicked>();JDBCWrapper jdbcWrapper = JDBCWrapper.getJDBCInstance();//表的字段timestamp、ip、userID、adID、province、city、clickedCountfor(final UserAdClicked clicked : userAdClickedList) {jdbcWrapper.doQuery("SELECT clickedCount FROM adclicked WHERE"+ " timestamp =? AND userID = ? AND adID = ?",new Object[]{clicked.getTimestamp(), clicked.getUserID(),clicked.getAdID()}, new ExecuteCallBack() {public void resultCallBack(ResultSet result) throws Exception {// TODO Auto-generated method stubif(result.next()) {long count = result.getLong(1);clicked.setClickedCount(count);updating.add(clicked);} else {inserting.add(clicked);clicked.setClickedCount(1L);} }});}//表的字段timestamp、ip、userID、adID、province、city、clickedCountList<Object[]> insertParametersList = new ArrayList<Object[]>();for(UserAdClicked insertRecord : inserting) {insertParametersList.add(new Object[] {insertRecord.getTimestamp(),insertRecord.getIp(),insertRecord.getUserID(),insertRecord.getAdID(),insertRecord.getProvince(),insertRecord.getCity(),insertRecord.getClickedCount()});}jdbcWrapper.doBatch("INSERT INTO adclicked VALUES(?, ?, ?, ?, ?, ?, ?)", insertParametersList);//表的字段timestamp、ip、userID、adID、province、city、clickedCountList<Object[]> updateParametersList = new ArrayList<Object[]>();for(UserAdClicked updateRecord : updating) {updateParametersList.add(new Object[] {updateRecord.getTimestamp(),updateRecord.getIp(),updateRecord.getUserID(),updateRecord.getAdID(),updateRecord.getProvince(),updateRecord.getCity(),updateRecord.getClickedCount() + 1});}jdbcWrapper.doBatch("UPDATE adclicked SET clickedCount = ? WHERE"+ " timestamp =? AND ip = ? AND userID = ? AND adID = ? "+ "AND province = ? AND city = ?", updateParametersList);} });return null;} });//再次过滤，从数据库中读取数据过滤黑名单JavaPairDStream<String, Long> blackListBasedOnHistory = filterClickedBatch.filter(new Function<Tuple2<String,Long>, Boolean>() {public Boolean call(Tuple2<String, Long> v1) throws Exception {//广告点击的基本数据格式：timestamp,ip,userID,adID,province,cityString[] splited = v1._1.split("\t"); //提取key值String date =splited[0];String userID =splited[2];String adID =splited[3];//查询一下数据库同一个用户同一个广告id点击量超过50次列入黑名单//接下来根据date、userID、adID条件去查询用户点击广告的数据表，获得总的点击次数//这个时候基于点击次数判断是否属于黑名单点击int clickedCountTotalToday = 81 ;if (clickedCountTotalToday > 50) {return true;}else {return false ;} }});//map操作，找出用户的idJavaDStream<String> blackListuserIDBasedInBatchOnhistroy =blackListBasedOnHistory.map(new Function<Tuple2<String,Long>, String>() {public String call(Tuple2<String, Long> v1) throws Exception {// TODO Auto-generated method stubreturn v1._1.split("\t")[2];} });//有一个问题，数据可能重复，在一个partition里面重复，这个好办；//但多个partition不能保证一个用户重复，需要对黑名单的整个rdd进行去重操作。//rdd去重了，partition也就去重了，一石二鸟，一箭双雕// 找出了黑名单，下一步就写入黑名单数据库表中JavaDStream<String> blackListUniqueuserBasedInBatchOnhistroy = blackListuserIDBasedInBatchOnhistroy.transform(new Function<JavaRDD<String>, JavaRDD<String>>() {public JavaRDD<String> call(JavaRDD<String> rdd) throws Exception {// TODO Auto-generated method stubreturn rdd.distinct();} });// 下一步写入到数据表中blackListUniqueuserBasedInBatchOnhistroy.foreachRDD(new Function<JavaRDD<String>, Void>() {public Void call(JavaRDD<String> rdd) throws Exception {rdd.foreachPartition(new VoidFunction<Iterator<String>>() {public void call(Iterator<String> t) throws Exception {// TODO Auto-generated method stub//插入的用户信息可以只包含：useID//此时直接插入黑名单数据表即可。//写入数据库List<Object[]> blackList = new ArrayList<Object[]>();while(t.hasNext()) {blackList.add(new Object[]{t.next()});}JDBCWrapper jdbcWrapper = JDBCWrapper.getJDBCInstance();jdbcWrapper.doBatch("INSERT INTO blacklisttable values (?)", blackList);} });return null;} });/广告点击累计动态更新,每个updateStateByKey都会在Batch Duration的时间间隔的基础上进行广告点击次数的更新，更新之后我们一般都会持久化到外部存储设备上，在这里我们存储到MySQL数据库中/JavaPairDStream<String, Long> updateStateByKeyDSteam = filteredadClickedStreaming.mapToPair(new PairFunction<Tuple2<String,String>, String, Long>() {public Tuple2<String, Long> call(Tuple2<String, String> t)throws Exception {String[] splited=t._2.split("\t");String timestamp = splited[0]; //YYYY-MM-DDString ip = splited[1];String userID = splited[2];String adID = splited[3];String province = splited[4];String city = splited[5]; String clickedRecord = timestamp + "_" +ip + "_"+userID+"_"+adID+"_"+province +"_"+city;return new Tuple2<String, Long>(clickedRecord, 1L);} }).updateStateByKey(new Function2<List<Long>, Optional<Long>, Optional<Long>>() {public Optional<Long> call(List<Long> v1, Optional<Long> v2)throws Exception {// v1:当前的Key在当前的Batch Duration中出现的次数的集合，例如{1，1，1，。。。，1}// v2:当前的Key在以前的Batch Duration中积累下来的结果；Long clickedTotalHistory = 0L; if(v2.isPresent()){clickedTotalHistory = v2.get();}for(Long one : v1) {clickedTotalHistory += one;}return Optional.of(clickedTotalHistory);} });updateStateByKeyDSteam.foreachRDD(new Function<JavaPairRDD<String,Long>, Void>() {public Void call(JavaPairRDD<String, Long> rdd) throws Exception {rdd.foreachPartition(new VoidFunction<Iterator<Tuple2<String,Long>>>() {public void call(Iterator<Tuple2<String, Long>> partition) throws Exception {//使用数据库连接池的高效读写数据库的方式将数据写入数据库mysql//例如一次插入 1000条 records，使用insertBatch 或 updateBatch//插入的用户数据信息：timestamp、adID、province、city//这里面有一个问题，可能出现两条记录的key是一样的，此时需要更新累加操作List<AdClicked> AdClickedList = new ArrayList<AdClicked>();while(partition.hasNext()) {Tuple2<String, Long> record = partition.next();String[] splited = record._1.split("\t");AdClicked adClicked = new AdClicked();adClicked.setTimestamp(splited[0]);adClicked.setAdID(splited[1]);adClicked.setProvince(splited[2]);adClicked.setCity(splited[3]);adClicked.setClickedCount(record._2);AdClickedList.add(adClicked);}final List<AdClicked> inserting = new ArrayList<AdClicked>();final List<AdClicked> updating = new ArrayList<AdClicked>();JDBCWrapper jdbcWrapper = JDBCWrapper.getJDBCInstance();//表的字段timestamp、ip、userID、adID、province、city、clickedCountfor(final AdClicked clicked : AdClickedList) {jdbcWrapper.doQuery("SELECT clickedCount FROM adclickedcount WHERE"+ " timestamp = ? AND adID = ? AND province = ? AND city = ?",new Object[]{clicked.getTimestamp(), clicked.getAdID(),clicked.getProvince(), clicked.getCity()}, new ExecuteCallBack() {public void resultCallBack(ResultSet result) throws Exception {// TODO Auto-generated method stubif(result.next()) {long count = result.getLong(1);clicked.setClickedCount(count);updating.add(clicked);} else {inserting.add(clicked);clicked.setClickedCount(1L);} }});}//表的字段timestamp、ip、userID、adID、province、city、clickedCountList<Object[]> insertParametersList = new ArrayList<Object[]>();for(AdClicked insertRecord : inserting) {insertParametersList.add(new Object[] {insertRecord.getTimestamp(),insertRecord.getAdID(),insertRecord.getProvince(),insertRecord.getCity(),insertRecord.getClickedCount()});}jdbcWrapper.doBatch("INSERT INTO adclickedcount VALUES(?, ?, ?, ?, ?)", insertParametersList);//表的字段timestamp、ip、userID、adID、province、city、clickedCountList<Object[]> updateParametersList = new ArrayList<Object[]>();for(AdClicked updateRecord : updating) {updateParametersList.add(new Object[] {updateRecord.getClickedCount(),updateRecord.getTimestamp(),updateRecord.getAdID(),updateRecord.getProvince(),updateRecord.getCity()});}jdbcWrapper.doBatch("UPDATE adclickedcount SET clickedCount = ? WHERE"+ " timestamp =? AND adID = ? AND province = ? AND city = ?", updateParametersList);} });return null;} });/ 对广告点击进行TopN计算，计算出每天每个省份Top5排名的广告因为我们直接对RDD进行操作，所以使用了transfomr算子；/updateStateByKeyDSteam.transform(new Function<JavaPairRDD<String,Long>, JavaRDD<Row>>() {public JavaRDD<Row> call(JavaPairRDD<String, Long> rdd) throws Exception {JavaRDD<Row> rowRDD = rdd.mapToPair(new PairFunction<Tuple2<String,Long>, String, Long>() {public Tuple2<String, Long> call(Tuple2<String, Long> t)throws Exception {// TODO Auto-generated method stubString[] splited=t._1.split("_");String timestamp = splited[0]; //YYYY-MM-DDString adID = splited[3];String province = splited[4];String clickedRecord = timestamp + "_" + adID + "_" + province;return new Tuple2<String, Long>(clickedRecord, t._2);} }).reduceByKey(new Function2<Long, Long, Long>() {public Long call(Long v1, Long v2) throws Exception {// TODO Auto-generated method stubreturn v1 + v2;} }).map(new Function<Tuple2<String,Long>, Row>() {public Row call(Tuple2<String, Long> v1) throws Exception {// TODO Auto-generated method stubString[] splited=v1._1.split("_");String timestamp = splited[0]; //YYYY-MM-DDString adID = splited[3];String province = splited[4];return RowFactory.create(timestamp, adID, province, v1._2);} });StructType structType = DataTypes.createStructType(Arrays.asList(DataTypes.createStructField("timestamp", DataTypes.StringType, true),DataTypes.createStructField("adID", DataTypes.StringType, true),DataTypes.createStructField("province", DataTypes.StringType, true),DataTypes.createStructField("clickedCount", DataTypes.LongType, true)));HiveContext hiveContext = new HiveContext(rdd.context());DataFrame df = hiveContext.createDataFrame(rowRDD, structType);df.registerTempTable("topNTableSource");DataFrame result = hiveContext.sql("SELECT timestamp, adID, province, clickedCount, FROM"+ " (SELECT timestamp, adID, province,clickedCount, "+ "ROW_NUMBER() OVER(PARTITION BY province ORDER BY clickeCount DESC) rank "+ "FROM topNTableSource) subquery "+ "WHERE rank <= 5");return result.toJavaRDD();} }).foreachRDD(new Function<JavaRDD<Row>, Void>() {public Void call(JavaRDD<Row> rdd) throws Exception {// TODO Auto-generated method stubrdd.foreachPartition(new VoidFunction<Iterator<Row>>() {public void call(Iterator<Row> t) throws Exception {// TODO Auto-generated method stubList<AdProvinceTopN> adProvinceTopN = new ArrayList<AdProvinceTopN>();while(t.hasNext()) {Row row = t.next();AdProvinceTopN item = new AdProvinceTopN();item.setTimestamp(row.getString(0));item.setAdID(row.getString(1));item.setProvince(row.getString(2));item.setClickedCount(row.getLong(3));adProvinceTopN.add(item);}// final List<AdProvinceTopN> inserting = new ArrayList<AdProvinceTopN>();// final List<AdProvinceTopN> updating = new ArrayList<AdProvinceTopN>();JDBCWrapper jdbcWrapper = JDBCWrapper.getJDBCInstance();Set<String> set = new HashSet<String>();for(AdProvinceTopN item: adProvinceTopN){set.add(item.getTimestamp() + "_" + item.getProvince());}//表的字段timestamp、adID、province、clickedCountArrayList<Object[]> deleteParametersList = new ArrayList<Object[]>();for(String deleteRecord : set) {String[] splited = deleteRecord.split("_");deleteParametersList.add(new Object[]{splited[0],splited[1]});}jdbcWrapper.doBatch("DELETE FROM adprovincetopn WHERE timestamp = ? AND province = ?", deleteParametersList);//表的字段timestamp、ip、userID、adID、province、city、clickedCountList<Object[]> insertParametersList = new ArrayList<Object[]>();for(AdProvinceTopN insertRecord : adProvinceTopN) {insertParametersList.add(new Object[] {insertRecord.getClickedCount(),insertRecord.getTimestamp(),insertRecord.getAdID(),insertRecord.getProvince()});}jdbcWrapper.doBatch("INSERT INTO adprovincetopn VALUES (?, ?, ?, ?)", insertParametersList);} });return null;} });/ 计算过去半个小时内广告点击的趋势广告点击的基本数据格式：timestamp、ip、userID、adID、province、city/filteredadClickedStreaming.mapToPair(new PairFunction<Tuple2<String,String>, String, Long>() {public Tuple2<String, Long> call(Tuple2<String, String> t)throws Exception {String splited[] = t._2.split("\t");String adID = splited[3];String time = splited[0]; //Todo:后续需要重构代码实现时间戳和分钟的转换提取。此处需要提取出该广告的点击分钟单位return new Tuple2<String, Long>(time + "_" + adID, 1L);} }).reduceByKeyAndWindow(new Function2<Long, Long, Long>() {public Long call(Long v1, Long v2) throws Exception {// TODO Auto-generated method stubreturn v1 + v2;} }, new Function2<Long, Long, Long>() {public Long call(Long v1, Long v2) throws Exception {// TODO Auto-generated method stubreturn v1 - v2;} }, Durations.minutes(30), Durations.milliseconds(5)).foreachRDD(new Function<JavaPairRDD<String,Long>, Void>() {public Void call(JavaPairRDD<String, Long> rdd) throws Exception {// TODO Auto-generated method stubrdd.foreachPartition(new VoidFunction<Iterator<Tuple2<String,Long>>>() {public void call(Iterator<Tuple2<String, Long>> partition)throws Exception {List<AdTrendStat> adTrend = new ArrayList<AdTrendStat>();// TODO Auto-generated method stubwhile(partition.hasNext()) {Tuple2<String, Long> record = partition.next();String[] splited = record._1.split("_");String time = splited[0];String adID = splited[1];Long clickedCount = record._2;/ 在插入数据到数据库的时候具体需要哪些字段？time、adID、clickedCount; 而我们通过J2EE技术进行趋势绘图的时候肯定是需要年、月、日、时、分这个维度的，所以我们在这里需要年月日、小时、分钟这些时间维度；/AdTrendStat adTrendStat = new AdTrendStat();adTrendStat.setAdID(adID);adTrendStat.setClickedCount(clickedCount);adTrendStat.set_date(time); //Todo:获取年月日adTrendStat.set_hour(time); //Todo:获取小时adTrendStat.set_minute(time);//Todo:获取分钟adTrend.add(adTrendStat);}final List<AdTrendStat> inserting = new ArrayList<AdTrendStat>();final List<AdTrendStat> updating = new ArrayList<AdTrendStat>();JDBCWrapper jdbcWrapper = JDBCWrapper.getJDBCInstance();//表的字段timestamp、ip、userID、adID、province、city、clickedCountfor(final AdTrendStat trend : adTrend) {final AdTrendCountHistory adTrendhistory = new AdTrendCountHistory();jdbcWrapper.doQuery("SELECT clickedCount FROM adclickedtrend WHERE"+ " date =? AND hour = ? AND minute = ? AND AdID = ?",new Object[]{trend.get_date(), trend.get_hour(), trend.get_minute(),trend.getAdID()}, new ExecuteCallBack() {public void resultCallBack(ResultSet result) throws Exception {// TODO Auto-generated method stubif(result.next()) {long count = result.getLong(1);adTrendhistory.setClickedCountHistoryLong(count);updating.add(trend);} else { inserting.add(trend);} }});}//表的字段date、hour、minute、adID、clickedCountList<Object[]> insertParametersList = new ArrayList<Object[]>();for(AdTrendStat insertRecord : inserting) {insertParametersList.add(new Object[] {insertRecord.get_date(),insertRecord.get_hour(),insertRecord.get_minute(),insertRecord.getAdID(),insertRecord.getClickedCount()});}jdbcWrapper.doBatch("INSERT INTO adclickedtrend VALUES(?, ?, ?, ?, ?)", insertParametersList);//表的字段date、hour、minute、adID、clickedCountList<Object[]> updateParametersList = new ArrayList<Object[]>();for(AdTrendStat updateRecord : updating) {updateParametersList.add(new Object[] {updateRecord.getClickedCount(),updateRecord.get_date(),updateRecord.get_hour(),updateRecord.get_minute(),updateRecord.getAdID()});}jdbcWrapper.doBatch("UPDATE adclickedtrend SET clickedCount = ? WHERE"+ " date =? AND hour = ? AND minute = ? AND AdID = ?", updateParametersList);} });return null;} });;/ Spark Streaming 执行引擎也就是Driver开始运行，Driver启动的时候是位于一条新的线程中的，当然其内部有消息循环体，用于接收应用程序本身或者Executor中的消息，/javassc.start();javassc.awaitTermination();javassc.close();}private static JavaStreamingContext createContext(String checkpointDirectory, SparkConf conf) {// If you do not see this printed, that means the StreamingContext has been loaded// from the new checkpointSystem.out.println("Creating new context");// Create the context with a 5 second batch sizeJavaStreamingContext ssc = new JavaStreamingContext(conf, Durations.seconds(10));ssc.checkpoint(checkpointDirectory);return ssc;} }class JDBCWrapper {private static JDBCWrapper jdbcInstance = null;private static LinkedBlockingQueue<Connection> dbConnectionPool = new LinkedBlockingQueue<Connection>();static {try {Class.forName("com.mysql.jdbc.Driver");} catch (ClassNotFoundException e) {// TODO Auto-generated catch blocke.printStackTrace();} }public static JDBCWrapper getJDBCInstance() {if(jdbcInstance == null) {synchronized (JDBCWrapper.class) {if(jdbcInstance == null) {jdbcInstance = new JDBCWrapper();} }}return jdbcInstance; }private JDBCWrapper() {for(int i = 0; i < 10; i++){try {Connection conn = DriverManager.getConnection("jdbc:mysql://Master:3306/sparkstreaming","root", "root");dbConnectionPool.put(conn);} catch (Exception e) {// TODO Auto-generated catch blocke.printStackTrace();} } }public synchronized Connection getConnection() {while(0 == dbConnectionPool.size()){try {Thread.sleep(20);} catch (InterruptedException e) {// TODO Auto-generated catch blocke.printStackTrace();} }return dbConnectionPool.poll();}public int[] doBatch(String sqlText, List<Object[]> paramsList){Connection conn = getConnection();PreparedStatement preparedStatement = null;int[] result = null;try {conn.setAutoCommit(false);preparedStatement = conn.prepareStatement(sqlText);for(Object[] parameters: paramsList) {for(int i = 0; i < parameters.length; i++){preparedStatement.setObject(i + 1, parameters[i]);} preparedStatement.addBatch();}result = preparedStatement.executeBatch();conn.commit();} catch (SQLException e) {// TODO Auto-generated catch blocke.printStackTrace();} finally {if(preparedStatement != null) {try {preparedStatement.close();} catch (SQLException e) {// TODO Auto-generated catch blocke.printStackTrace();} }if(conn != null) {try {dbConnectionPool.put(conn);} catch (InterruptedException e) {// TODO Auto-generated catch blocke.printStackTrace();} }}return result; }public void doQuery(String sqlText, Object[] paramsList, ExecuteCallBack callback){Connection conn = getConnection();PreparedStatement preparedStatement = null;ResultSet result = null;try {preparedStatement = conn.prepareStatement(sqlText);for(int i = 0; i < paramsList.length; i++){preparedStatement.setObject(i + 1, paramsList[i]);} result = preparedStatement.executeQuery();try {callback.resultCallBack(result);} catch (Exception e) {// TODO Auto-generated catch blocke.printStackTrace();} } catch (SQLException e) {// TODO Auto-generated catch blocke.printStackTrace();} finally {if(preparedStatement != null) {try {preparedStatement.close();} catch (SQLException e) {// TODO Auto-generated catch blocke.printStackTrace();} }if(conn != null) {try {dbConnectionPool.put(conn);} catch (InterruptedException e) {// TODO Auto-generated catch blocke.printStackTrace();} }} }}interface ExecuteCallBack {void resultCallBack(ResultSet result) throws Exception;}class UserAdClicked {private String timestamp;private String ip;private String userID;private String adID;private String province;private String city;private Long clickedCount;public String getTimestamp() {return timestamp;}public void setTimestamp(String timestamp) {this.timestamp = timestamp;}public String getIp() {return ip;}public void setIp(String ip) {this.ip = ip;}public String getUserID() {return userID;}public void setUserID(String userID) {this.userID = userID;}public String getAdID() {return adID;}public void setAdID(String adID) {this.adID = adID;}public String getProvince() {return province;}public void setProvince(String province) {this.province = province;}public String getCity() {return city;}public void setCity(String city) {this.city = city;}public Long getClickedCount() {return clickedCount;}public void setClickedCount(Long clickedCount) {this.clickedCount = clickedCount;} }class AdClicked {private String timestamp;private String adID;private String province;private String city;private Long clickedCount;public String getTimestamp() {return timestamp;}public void setTimestamp(String timestamp) {this.timestamp = timestamp;}public String getAdID() {return adID;}public void setAdID(String adID) {this.adID = adID;}public String getProvince() {return province;}public void setProvince(String province) {this.province = province;}public String getCity() {return city;}public void setCity(String city) {this.city = city;}public Long getClickedCount() {return clickedCount;}public void setClickedCount(Long clickedCount) {this.clickedCount = clickedCount;} }class AdProvinceTopN {private String timestamp;private String adID;private String province;private Long clickedCount;public String getTimestamp() {return timestamp;}public void setTimestamp(String timestamp) {this.timestamp = timestamp;}public String getAdID() {return adID;}public void setAdID(String adID) {this.adID = adID;}public String getProvince() {return province;}public void setProvince(String province) {this.province = province;}public Long getClickedCount() {return clickedCount;}public void setClickedCount(Long clickedCount) {this.clickedCount = clickedCount;} }class AdTrendStat {private String _date;private String _hour;private String _minute;private String adID;private Long clickedCount;public String get_date() {return _date;}public void set_date(String _date) {this._date = _date;}public String get_hour() {return _hour;}public void set_hour(String _hour) {this._hour = _hour;}public String get_minute() {return _minute;}public void set_minute(String _minute) {this._minute = _minute;}public String getAdID() {return adID;}public void setAdID(String adID) {this.adID = adID;}public Long getClickedCount() {return clickedCount;}public void setClickedCount(Long clickedCount) {this.clickedCount = clickedCount;} }class AdTrendCountHistory{private Long clickedCountHistoryLong;public Long getClickedCountHistoryLong() {return clickedCountHistoryLong;}public void setClickedCountHistoryLong(Long clickedCountHistoryLong) {this.clickedCountHistoryLong = clickedCountHistoryLong;} } 本篇文章为转载内容。原文链接：https://blog.csdn.net/tom_8899_li/article/details/71194434。该文由互联网用户投稿提供，文中观点代表作者本人意见，并不代表本站的立场。作为信息平台，本站仅提供文章转载服务，并不拥有其所有权，也不对文章内容的真实性、准确性和合法性承担责任。如发现本文存在侵权、违法、违规或事实不符的情况，请及时联系我们，我们将第一时间进行核实并删除相应内容。

2023-02-14 19:16:35

297

转载

转载文章

[转载]基于activemq的分布式事务解决方案

...道，一旦有一个数据库连接失败，另一个数据库的操作就不会进行，一个数据库操作失败就会导致另一个数据库回滚，只有他们全部成功两个数据库的事务才会提交。基于XA协议的两段和三段提交是一种严格的安全确认机制，其安全性是非常高的，但是保证安全性的前提是牺牲了性能，这个就是分布式系统里面的CAP理论，做任何架构的前提需要有取舍。所以基于XA协议的分布式事务并发性不高，不适合高并发场景。 2、基于activemq的解决方案如图： 1、支付宝扣款成功时往message表插入消息 2、message表有message_id(流水id，标识夸系统的一次转账操作),status（confirm，unconfirm） 3、timer扫描message表的unconfirm状态记录往activemq插入消息 4、余额宝收到消息消费消息时先查询message表如果有记录就不处理如果没记录就进行数据库增款操作 5、如果余额宝数据库操作成功往余额宝message表插入消息，表字段跟支付宝message一致 6、如果5操作成功，回调支付宝接口修改message表状态，把unconfirm状态转换成confirm状态问题描述： 1、支付宝设计message表的目的如果支付宝往activemq插入消息而余额宝消费消息异常，有可能是消费消息成功而事务操作异常，有可能是网络异常等等不确定因素。如果出现异常而activemq收到了确认消息的信号，这时候activemq中的消息是删除了的，消息丢失了。设置message表就是有一个消息存根，activemq中消息丢失了message表中的消息还在。解决了activemq消息丢失问题 2、余额宝设计message表的目的当余额宝消费成功并且数据库操作成功时，回调支付宝的消息确认接口，如果回调接口时出现异常导致支付宝状态修改失败还是unconfirm状态，这时候还会被timer扫描到，又会往activemq插入消息，又会被余额宝消费一边，但是这条消息已经消费成功了的只是回调失败而已，所以就需要有一个这样的message表，当余额宝消费时先插入message表，如果message根据message_id能查询到记录就说明之前这条消息被消费过就不再消费只需要回调成功即可，如果查询不到消息就消费这条消息继续数据库操作，数据库操作成功就往message表插入消息。这样就解决了消息重复消费问题，这也是消费端的幂等操作。基于消息中间件的分布式事务是最理想的分布式事务解决方案，兼顾了安全性和并发性！接下来贴代码：支付宝代码： @Controller@RequestMapping("/order")public class OrderController {/ @Description TODO @param @return 参数 @return String 返回类型 @throws userID：转账的用户ID amount：转多少钱/@Autowired@Qualifier("activemq")OrderService orderService;@RequestMapping("/transfer")public @ResponseBody String transferAmount(String userId,String messageId, int amount) {try {orderService.updateAmount(amount,messageId, userId);}catch (Exception e) {e.printStackTrace();return "===============================transferAmount failed===================";}return "===============================transferAmount successfull===================";}@RequestMapping("/callback")public String callback(String param) {JSONObject parse = JSONObject.parseObject(param);String respCode = parse.getString("respCode");if(!"OK".equalsIgnoreCase(respCode)) {return null;}try {orderService.updateMessage(param);}catch (Exception e) {e.printStackTrace();return "fail";}return "ok";} } public interface OrderService {public void updateAmount(int amount, String userId,String messageId);public void updateMessage(String param);} @Service("activemq")@Transactional(rollbackFor = Exception.class)public class OrderServiceActivemqImpl implements OrderService {Logger logger = LoggerFactory.getLogger(getClass());@AutowiredJdbcTemplate jdbcTemplate;@AutowiredJmsTemplate jmsTemplate;@Overridepublic void updateAmount(final int amount, final String messageId, final String userId) {String sql = "update account set amount = amount - ?,update_time=now() where user_id = ?";int count = jdbcTemplate.update(sql, new Object[]{amount, userId});if (count == 1) {//插入到消息记录表sql = "insert into message(user_id,message_id,amount,status) values (?,?,?,?)";int row = jdbcTemplate.update(sql,new Object[]{userId,messageId,amount,"unconfirm"});if(row == 1) {//往activemq中插入消息jmsTemplate.send("zg.jack.queue", new MessageCreator() {@Overridepublic Message createMessage(Session session) throws JMSException {com.zhuguang.jack.bean.Message message = new com.zhuguang.jack.bean.Message();message.setAmount(Integer.valueOf(amount));message.setStatus("unconfirm");message.setUserId(userId);message.setMessageId(messageId);return session.createObjectMessage(message);} });} }}@Overridepublic void updateMessage(String param) {JSONObject parse = JSONObject.parseObject(param);String messageId = parse.getString("messageId");String sql = "update message set status = ? where message_id = ?";int count = jdbcTemplate.update(sql,new Object[]{"confirm",messageId});if(count == 1) {logger.info(messageId + " callback successfull");} }} activemq.xml <?xml version="1.0" encoding="UTF-8"?><beans xmlns="http://www.springframework.org/schema/beans"xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"xmlns:amq="http://activemq.apache.org/schema/core"xmlns:jms="http://www.springframework.org/schema/jms"xmlns:context="http://www.springframework.org/schema/context"xmlns:mvc="http://www.springframework.org/schema/mvc"xsi:schemaLocation="http://www.springframework.org/schema/beanshttp://www.springframework.org/schema/beans/spring-beans-4.1.xsdhttp://www.springframework.org/schema/contexthttp://www.springframework.org/schema/context/spring-context-4.1.xsdhttp://www.springframework.org/schema/mvchttp://www.springframework.org/schema/mvc/spring-mvc-4.1.xsdhttp://www.springframework.org/schema/jmshttp://www.springframework.org/schema/jms/spring-jms-4.1.xsdhttp://activemq.apache.org/schema/corehttp://activemq.apache.org/schema/core/activemq-core-5.12.1.xsd"><context:component-scan base-package="com.zhuguang.jack" /><mvc:annotation-driven /><amq:connectionFactory id="amqConnectionFactory"brokerURL="tcp://192.168.88.131:61616"userName="system"password="manager" /><bean id="connectionFactory"class="org.springframework.jms.connection.CachingConnectionFactory"><constructor-arg ref="amqConnectionFactory" /><property name="sessionCacheSize" value="100" /></bean><bean id="demoQueueDestination" class="org.apache.activemq.command.ActiveMQQueue"><constructor-arg><value>zg.jack.queue</value></constructor-arg></bean><bean id="jmsTemplate" class="org.springframework.jms.core.JmsTemplate"><property name="connectionFactory" ref="connectionFactory" /><property name="defaultDestination" ref="demoQueueDestination" /><property name="receiveTimeout" value="10000" /><property name="pubSubDomain" value="false" /></bean></beans> spring-dispatcher.xml <beans xmlns="http://www.springframework.org/schema/beans"xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:p="http://www.springframework.org/schema/p"xmlns:context="http://www.springframework.org/schema/context"xmlns:task="http://www.springframework.org/schema/task" xmlns:aop="http://www.springframework.org/schema/aop"xmlns:tx="http://www.springframework.org/schema/tx"xmlns:util="http://www.springframework.org/schema/util" xmlns:mvc="http://www.springframework.org/schema/mvc"xsi:schemaLocation="http://www.springframework.org/schema/utilhttp://www.springframework.org/schema/util/spring-util-3.2.xsdhttp://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.2.xsdhttp://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context-3.2.xsdhttp://www.springframework.org/schema/mvchttp://www.springframework.org/schema/mvc/spring-mvc-3.2.xsdhttp://www.springframework.org/schema/task http://www.springframework.org/schema/task/spring-task-3.0.xsdhttp://www.springframework.org/schema/txhttp://www.springframework.org/schema/tx/spring-tx-3.0.xsdhttp://www.springframework.org/schema/aop http://www.springframework.org/schema/aop/spring-aop-3.0.xsd"><import resource="../activemq/activemq.xml"/><bean id="propertyConfigurerForProject1" class="org.springframework.beans.factory.config.PropertyPlaceholderConfigurer"><property name="order" value="1" /><property name="ignoreUnresolvablePlaceholders" value="true" /><property name="location"><value>classpath:config/core/core.properties</value></property></bean><mvc:annotation-driven><mvc:message-converters register-defaults="true"><bean class="org.springframework.http.converter.StringHttpMessageConverter"><property name="supportedMediaTypes" value = "text/plain;charset=UTF-8" /></bean></mvc:message-converters></mvc:annotation-driven><bean id="mappingJacksonHttpMessageConverter" class="org.springframework.http.converter.json.MappingJacksonHttpMessageConverter"><property name="supportedMediaTypes"><list><value>text/html;charset=UTF-8</value></list></property></bean><context:component-scan base-package="com.zhuguang"></context:component-scan><mvc:view-controller path="/" view-name="redirect:/index" /><beanclass="org.springframework.web.servlet.mvc.annotation.DefaultAnnotationHandlerMapping" /><bean id="handlerAdapter"class="org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandlerAdapter"></bean><beanclass="org.springframework.web.servlet.view.ContentNegotiatingViewResolver"><property name="mediaTypes"><map><entry key="json" value="application/json" /><entry key="xml" value="application/xml" /><entry key="html" value="text/html" /></map></property><property name="viewResolvers"><list><bean class="org.springframework.web.servlet.view.BeanNameViewResolver" /><bean class="org.springframework.web.servlet.view.UrlBasedViewResolver"><property name="viewClass" value="org.springframework.web.servlet.view.JstlView" /><property name="prefix" value="/" /><property name="suffix" value=".jsp" /></bean></list></property></bean> <bean id="exceptionResolver"class="org.springframework.web.servlet.handler.SimpleMappingExceptionResolver"><property name="exceptionMappings"><props><prop key="java.lang.Exception">error</prop></props></property></bean><bean id="dataSource" class="com.mchange.v2.c3p0.ComboPooledDataSource" destroy-method="close"><property name="driverClass"><value>${jdbc.driverClassName}</value></property><property name="jdbcUrl"><value>${jdbc.url}</value></property><property name="user"><value>${jdbc.username}</value></property><property name="password"><value>${jdbc.password}</value></property><property name="minPoolSize" value="10" /><property name="maxPoolSize" value="100" /><property name="maxIdleTime" value="1800" /><property name="acquireIncrement" value="3" /><property name="maxStatements" value="1000" /><property name="initialPoolSize" value="10" /><property name="idleConnectionTestPeriod" value="60" /><property name="acquireRetryAttempts" value="30" /><property name="breakAfterAcquireFailure" value="false" /><property name="testConnectionOnCheckout" value="false" /><property name="acquireRetryDelay"><value>100</value></property></bean><bean id="jdbcTemplate" class="org.springframework.jdbc.core.JdbcTemplate"><property name="dataSource" ref="dataSource"></property></bean><bean id="transactionManager" class="org.springframework.jdbc.datasource.DataSourceTransactionManager"><property name="dataSource" ref="dataSource"/></bean><tx:annotation-driven transaction-manager="transactionManager" proxy-target-class="true" /><aop:aspectj-autoproxy expose-proxy="true"/></beans> logback.xml <?xml version="1.0" encoding="UTF-8"?><configuration scan="false" scanPeriod="60 seconds" debug="false"><property name="appName" value="netty"></property><appender name="stdout" class="ch.qos.logback.core.ConsoleAppender"><Encoding>UTF-8</Encoding><encoder><pattern>%d{yyyy-MM-dd HH:mm:ss.SSS} [%thread] %-5level %logger{50} - %msg%n</pattern></encoder></appender> <appender name="appLogAppender" class="ch.qos.logback.core.rolling.RollingFileAppender"><Encoding>UTF-8</Encoding> <file>${appName}.log</file><rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy"><fileNamePattern>${appName}-%d{yyyy-MM-dd}-%i.log</fileNamePattern><MaxHistory>365</MaxHistory><timeBasedFileNamingAndTriggeringPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP"><maxFileSize>100MB</maxFileSize></timeBasedFileNamingAndTriggeringPolicy></rollingPolicy> <encoder><pattern>%d{yyyy-MM-dd HH:mm:ss.SSS} [ %thread ] - [ %-5level ] [ %logger{50} : %line ] - %msg%n</pattern></encoder></appender><root level="debug"><appender-ref ref="stdout" /><appender-ref ref="appLogAppender" /></root></configuration> 2、余额宝代码 package com.zhuguang.jack.controller;import com.alibaba.fastjson.JSONObject;import com.zhuguang.jack.service.OrderService;import org.springframework.beans.factory.annotation.Autowired;import org.springframework.stereotype.Controller;import org.springframework.web.bind.annotation.RequestMapping;import org.springframework.web.bind.annotation.ResponseBody;@Controller@RequestMapping("/order")public class OrderController {/ @Description TODO @param @return 参数 @return String 返回类型 @throws 模拟银行转账 userID：转账的用户ID amount：转多少钱/@AutowiredOrderService orderService;@RequestMapping("/transfer")public @ResponseBody String transferAmount(String userId, String amount) {try {orderService.updateAmount(Integer.valueOf(amount), userId);}catch (Exception e) {e.printStackTrace();return "===============================transferAmount failed===================";}return "===============================transferAmount successfull===================";} } 消息监听器 package com.zhuguang.jack.listener;import com.alibaba.fastjson.JSONObject;import com.zhuguang.jack.service.OrderService;import org.slf4j.Logger;import org.slf4j.LoggerFactory;import org.springframework.beans.factory.annotation.Autowired;import org.springframework.http.client.SimpleClientHttpRequestFactory;import org.springframework.stereotype.Service;import org.springframework.transaction.annotation.Transactional;import org.springframework.web.client.RestTemplate;import javax.jms.JMSException;import javax.jms.Message;import javax.jms.MessageListener;import javax.jms.ObjectMessage;@Service("queueMessageListener")public class QueueMessageListener implements MessageListener {private Logger logger = LoggerFactory.getLogger(getClass());@AutowiredOrderService orderService;@Transactional(rollbackFor = Exception.class)@Overridepublic void onMessage(Message message) {if (message instanceof ObjectMessage) {ObjectMessage objectMessage = (ObjectMessage) message;try {com.zhuguang.jack.bean.Message message1 = (com.zhuguang.jack.bean.Message) objectMessage.getObject();String userId = message1.getUserId();int count = orderService.queryMessageCountByUserId(userId);if (count == 0) {orderService.updateAmount(message1.getAmount(), message1.getUserId());orderService.insertMessage(message1.getUserId(), message1.getMessageId(), message1.getAmount(), "ok");} else {logger.info("异常转账");}RestTemplate restTemplate = createRestTemplate();JSONObject jo = new JSONObject();jo.put("messageId", message1.getMessageId());jo.put("respCode", "OK");String url = "http://jack.bank_a.com:8080/alipay/order/callback?param="+ jo.toJSONString();restTemplate.getForObject(url,null);} catch (JMSException e) {e.printStackTrace();throw new RuntimeException("异常");} }}public RestTemplate createRestTemplate() {SimpleClientHttpRequestFactory simpleClientHttpRequestFactory = new SimpleClientHttpRequestFactory();simpleClientHttpRequestFactory.setConnectTimeout(3000);simpleClientHttpRequestFactory.setReadTimeout(2000);return new RestTemplate(simpleClientHttpRequestFactory);} } package com.zhuguang.jack.service;public interface OrderService {public void updateAmount(int amount, String userId);public int queryMessageCountByUserId(String userId);public int insertMessage(String userId,String messageId,int amount,String status);} package com.zhuguang.jack.service;import org.slf4j.Logger;import org.slf4j.LoggerFactory;import org.springframework.beans.factory.annotation.Autowired;import org.springframework.http.client.SimpleClientHttpRequestFactory;import org.springframework.jdbc.core.JdbcTemplate;import org.springframework.stereotype.Service;import org.springframework.transaction.annotation.Transactional;import org.springframework.web.client.RestTemplate;@Service@Transactional(rollbackFor = Exception.class)public class OrderServiceImpl implements OrderService {private Logger logger = LoggerFactory.getLogger(getClass());@AutowiredJdbcTemplate jdbcTemplate;/ 更新数据库表，把账户余额减去amountd/@Overridepublic void updateAmount(int amount, String userId) {//1、农业银行转账3000，也就说农业银行jack账户要减3000String sql = "update account set amount = amount + ?,update_time=now() where user_id = ?";int count = jdbcTemplate.update(sql, new Object[] {amount, userId});if (count != 1) {throw new RuntimeException("订单创建失败，农业银行转账失败！");} }public RestTemplate createRestTemplate() {SimpleClientHttpRequestFactory simpleClientHttpRequestFactory = new SimpleClientHttpRequestFactory();simpleClientHttpRequestFactory.setConnectTimeout(3000);simpleClientHttpRequestFactory.setReadTimeout(2000);return new RestTemplate(simpleClientHttpRequestFactory);}@Overridepublic int queryMessageCountByUserId(String messageId) {String sql = "select count() from message where message_id = ?";int count = jdbcTemplate.queryForInt(sql, new Object[]{messageId});return count;}@Overridepublic int insertMessage(String userId, String message_id,int amount, String status) {String sql = "insert into message(user_id,message_id,amount,status) values(?,?,?)";int count = jdbcTemplate.update(sql, new Object[]{userId, message_id,amount, status});if(count == 1) {logger.info("Ok");}return count;} } activemq.xml <?xml version="1.0" encoding="UTF-8"?><beans xmlns="http://www.springframework.org/schema/beans"xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"xmlns:amq="http://activemq.apache.org/schema/core"xmlns:jms="http://www.springframework.org/schema/jms"xmlns:context="http://www.springframework.org/schema/context"xmlns:mvc="http://www.springframework.org/schema/mvc"xsi:schemaLocation="http://www.springframework.org/schema/beanshttp://www.springframework.org/schema/beans/spring-beans-4.1.xsdhttp://www.springframework.org/schema/contexthttp://www.springframework.org/schema/context/spring-context-4.1.xsdhttp://www.springframework.org/schema/mvchttp://www.springframework.org/schema/mvc/spring-mvc-4.1.xsdhttp://www.springframework.org/schema/jmshttp://www.springframework.org/schema/jms/spring-jms-4.1.xsdhttp://activemq.apache.org/schema/corehttp://activemq.apache.org/schema/core/activemq-core-5.12.1.xsd"><context:component-scan base-package="com.zhuguang.jack" /><mvc:annotation-driven /><amq:connectionFactory id="amqConnectionFactory"brokerURL="tcp://192.168.88.131:61616"userName="system"password="manager" /><bean id="connectionFactory"class="org.springframework.jms.connection.CachingConnectionFactory"><constructor-arg ref="amqConnectionFactory" /><property name="sessionCacheSize" value="100" /></bean><bean id="demoQueueDestination" class="org.apache.activemq.command.ActiveMQQueue"><constructor-arg><value>zg.jack.queue</value></constructor-arg></bean><bean id="queueListenerContainer"class="org.springframework.jms.listener.DefaultMessageListenerContainer"><property name="connectionFactory" ref="connectionFactory" /><property name="destination" ref="demoQueueDestination" /><property name="messageListener" ref="queueMessageListener" /></bean><bean id="jmsTemplate" class="org.springframework.jms.core.JmsTemplate"><property name="connectionFactory" ref="connectionFactory" /><property name="defaultDestination" ref="demoQueueDestination" /><property name="receiveTimeout" value="10000" /><property name="pubSubDomain" value="false" /></bean></beans> OK~~~~~~~~~~~~大功告成！！！，如果大家觉得满意并且对技术感兴趣请加群：171239762，纯技术交流群，非诚勿扰。本篇文章为转载内容。原文链接：https://blog.csdn.net/luoyang_java/article/details/84953241。该文由互联网用户投稿提供，文中观点代表作者本人意见，并不代表本站的立场。作为信息平台，本站仅提供文章转载服务，并不拥有其所有权，也不对文章内容的真实性、准确性和合法性承担责任。如发现本文存在侵权、违法、违规或事实不符的情况，请及时联系我们，我们将第一时间进行核实并删除相应内容。

2023-04-16 22:34:52

499

转载

转载文章

[转载]Linux Mysql 搭建

...内容。 Linux Mysql 搭建 systemctl stop firewalld 停止防火墙服务systemctl disable firewalld 禁止防火墙服务开机自启动sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/sysconfig/selinux // 将 selinux文件中的SELINUX值修改为disabledwget -O /etc/yum.repos.d/openEulerOS.repo https://repo.huaweicloud.com/repository/conf/openeuler_aarch64.repo 增加openEulerOS.repo yum clean all 清除之前的所有仓库缓存yum makecache 生成软件包信息缓存,以提高搜索安装软件的速度dnf install mysqlmkdir /var/lib/mysql // 在 /var/lib 目录下创建一个mysql 目录cd /var/lib/mysql/ // 切换到这个目录mkdir data tmp run log // 在 mysql目录下创建 data, tmp,run,log 四个子目录touch /var/lib/mysql/log/mysql.log // 在log 目录下创建mysql.log空文件chown -R mysql:mysql /var/lib/mysql/ // 将 mysql目录下的所有文件所有者及群组都设为 mysqlrm -f /etc/my.cnf// 将一些信息导入到 my.cnf 中echo -e "[mysqld_safe]\nlog-error=/var/lib/mysql/log/mysql.log\npid-file=/var/lib/mysql/run/mysqld.pid\n\n[mysqldump]\nquick\n\n[mysql]\nno-auto-rehash\n\n[client]\nport=3306\nmax_allowed_packet=64M\ndefault-character-set=utf8\n\n[mysqld]\nuser=root\nport=3306\nbasedir=/usr/local/mysql\nsocket=/var/lib/mysql/run/mysql.sock\ntmpdir=/var/lib/mysql/tmp\ndatadir=/var/lib/mysql/data\ndefault_authentication_plugin=mysql_native_password\nskip-grant-tables\nkey_buffer_size=16M" > /etc/my.cnfcat /etc/my.cnf // 查看文件内容chown mysql:mysql /etc/my.cnf // 将该文件的所有者及群组都设为 mysqlll /etc/my.cnfchmod 777 /usr/local/mysql/support-files/mysql.server //对mysql.server的所有者,群组,其他用户设置读，写，执行，权限cp /usr/local/mysql/support-files/mysql.server /etc/init.d/mysqlchkconfig mysql on // 开机自动启动chown -R mysql:mysql /etc/init.d/mysqlvi /etc/profile // 把 export PATH=$PATH:/usr/local/mysql/bin 放到文件尾端,设置环境变量source /etc/profile // 重新执行刚修改的文件,使之立即生效env // 显示系统的环境变量mysqld --defaults-file=/etc/my.cnf --initializechown -R mysql:mysql /var/lib/mysql/datall /var/lib/mysql/dataservice mysql startservice mysql status // 查看服务状态ps -ef | grep mysqlnetstat -anptnetstat -anpt | grep mysqlnetstat -anpt | grep 3306 显示有关mysql的进程mysql -u root -p -S /var/lib/mysql/run/mysql.sock // 输入密码进入到了mysqlalter user 'root'@'localhost' identified by "123456";flush privileges;create user 'user'@'%' identified by '123456';grant all privileges on . to 'user'@'%' with grant option;flush privileges;select user,host from mysql.user; service mysql stop 停止服务\q回到命令行vi /etc/ld.so.confldconfig 搜索出可共享的动态链接库(格式如lib.so),进而创建出动态装入程序(ld.so)所需的连接和缓存文件。缓存文件默认为/etc/ld.so.cacheln -s /var/ldconfiglib/mysql/run/mysql.sock /tmp/mysql.sock 建立软连接 service 和 chkconfig 都可以用 systemctl 来代替遇到 Can’t connect to local MySQL server through socket ‘/tmp/mysql.sock’ (2) service mysql stop // 先停用ln -s /var/lib/mysql/mysql.sock /tmp/mysql.sock // 建立软连接vi /etc/my.cnf // 修改里面的 socket 路径service mysql start // 重启 Linux chmod 命令 Linux文件的所有者、群组和其他人本篇文章为转载内容。原文链接：https://blog.csdn.net/qq_53318060/article/details/121664128。该文由互联网用户投稿提供，文中观点代表作者本人意见，并不代表本站的立场。作为信息平台，本站仅提供文章转载服务，并不拥有其所有权，也不对文章内容的真实性、准确性和合法性承担责任。如发现本文存在侵权、违法、违规或事实不符的情况，请及时联系我们，我们将第一时间进行核实并删除相应内容。

2023-05-24 19:00:46

119

转载

Kylin

Kylin配置详解：实现跨Hadoop集群数据源查询与Cube构建，整合JDBC连接与HBase REST服务

.... 配置跨集群数据源连接 2.1 配置远程数据源连接首先，我们需要在Kylin的kylin.properties配置文件中指定远程数据源的相关信息。例如，假设我们的原始数据位于一个名为“ClusterA”的Hadoop集群： properties kylin.source.hdfs-working-dir=hdfs://ClusterA:8020/user/kylin/ kylin.storage.hbase.rest-url=http://ClusterA:60010/ 这里，我们设置了HDFS的工作目录以及HBase REST服务的URL地址，确保Kylin能访问到ClusterA上的数据。 2.2 配置数据源连接器（JDBC）对于关系型数据库作为数据源的情况，还需要配置相应的JDBC连接信息。例如，若ClusterB上有一个MySQL数据库： properties kylin.source.jdbc.url=jdbc:mysql://ClusterB:3306/mydatabase?useSSL=false kylin.source.jdbc.user=myuser kylin.source.jdbc.pass=mypassword 3. 创建项目及模型并关联远程表接下来，在Kylin的Web界面创建一个新的项目，并在该项目下定义数据模型。在选择数据表时，Kylin会根据之前配置的HDFS和JDBC连接信息自动发现远程集群中的表。 - 创建项目：在Kylin管理界面点击"Create Project"，填写项目名称和描述等信息。 - 定义模型：在新建的项目下，点击"Model" -> "Create Model"，添加从远程集群引用的表，并设计所需的维度和度量。 4. 构建Cube并对跨集群数据进行查询完成模型定义后，即可构建Cube。Kylin会在后台执行MapReduce任务，读取远程集群的数据并进行预计算。构建完成后，您便可以针对这个Cube进行快速、高效的查询操作，即使这些数据分布在不同的集群上。 bash 在Kylin命令行工具中构建Cube ./bin/kylin.sh org.apache.kylin.tool.BuildCubeCommand --cube-name MyCube --project-name MyProject --build-type BUILD 至此，通过精心配置和一系列操作，您的Kylin环境已经成功支持了跨集群的数据源查询。在这一路走来，我们不断挠头琢磨、摸石头过河、动手实践，不仅硬生生攻克了技术上的难关，更是让Kylin在各种复杂环境下的强大适应力和灵活应变能力展露无遗。总结起来，配置Kylin支持跨集群查询的关键在于正确设置数据源连接，并在模型设计阶段合理引用这些远程数据源。每一次操作都像是人类智慧的一次小小爆发，每查询成功的背后，都是我们对Kylin功能那股子钻研劲儿和精心打磨的成果。在这整个过程中，我们实实在在地感受到了Kylin这款大数据处理神器的厉害之处，它带来的便捷性和无限可能性，真是让我们大开眼界，赞不绝口啊！

2023-01-26 10:59:48

月下独酌

转载文章

[转载]mysql的配置文件的各项参数意思

...其他默认调优值 MySQL Server Instance Configuration File MySQL服务器实例配置文件 ---------------------------------------------------------------------- Generated by the MySQL Server Instance Configuration Wizard 由MySQL服务器实例配置向导生成 Installation Instructions 安装说明 ---------------------------------------------------------------------- On Linux you can copy this file to /etc/my.cnf to set global options, mysql-data-dir/my.cnf to set server-specific options (@localstatedir@ for this installation) or to ~/.my.cnf to set user-specific options. 在Linux上，您可以将该文件复制到/etc/my.cnf来设置全局选项，mysql-data-dir/my.cnf来设置特定于服务器的选项(此安装的@localstatedir@)，或者~/.my.cnf来设置特定于用户的选项。 On Windows you should keep this file in the installation directory of your server (e.g. C:\Program Files\MySQL\MySQL Server X.Y). To make sure the server reads the config file use the startup option "--defaults-file". 在Windows上你应该保持这个文件在服务器的安装目录(例如C:\Program Files\MySQL\MySQL服务器X.Y)。要确保服务器读取配置文件，请使用启动选项“——default -file”。 To run the server from the command line, execute this in a command line shell, e.g. mysqld --defaults-file="C:\Program Files\MySQL\MySQL Server X.Y\my.ini" 要从命令行运行服务器，请在命令行shell中执行，例如mysqld——default -file="C:\Program Files\MySQL\MySQL server X.Y\my.ini" To install the server as a Windows service manually, execute this in a command line shell, e.g. mysqld --install MySQLXY --defaults-file="C:\Program Files\MySQL\MySQL Server X.Y\my.ini" 要手动将服务器安装为Windows服务，请在命令行shell中执行此操作，例如mysqld——install MySQLXY——default -file="C:\Program Files\MySQL\MySQL server X.Y\my.ini" And then execute this in a command line shell to start the server, e.g. net start MySQLXY 然后在命令行shell中执行这个命令来启动服务器，例如net start MySQLXY Guidelines for editing this file编辑此文件的指南 ---------------------------------------------------------------------- In this file, you can use all long options that the program supports. If you want to know the options a program supports, start the program with the "--help" option. 在这个文件中，您可以使用程序支持的所有长选项。如果您想知道程序支持的选项，请使用“——help”选项启动程序。 More detailed information about the individual options can also be found in the manual. For advice on how to change settings please see https://dev.mysql.com/doc/refman/8.0/en/server-configuration-defaults.html 有关各个选项的更详细信息也可以在手册中找到。有关如何更改设置的建议，请参见https://dev.mysql.com/doc/refman/8.0/en/server-configuration-defaults.html CLIENT SECTION 客户端部分 ---------------------------------------------------------------------- The following options will be read by MySQL client applications. Note that only client applications shipped by MySQL are guaranteed to read this section. If you want your own MySQL client program to honor these values, you need to specify it as an option during the MySQL client library initialization. MySQL客户机应用程序将读取以下选项。注意，只有MySQL提供的客户端应用程序才能阅读本节。如果您希望自己的MySQL客户机程序遵守这些值，您需要在初始化MySQL客户机库时将其指定为一个选项。 [client] pipe= socket=MYSQL port=3306 [mysql] no-beep default-character-set= SERVER SECTION 服务器部分 ---------------------------------------------------------------------- The following options will be read by the MySQL Server. Make sure that you have installed the server correctly (see above) so it reads this file. MySQL服务器将读取以下选项。确保您已经正确安装了服务器(参见上面)，以便它读取这个文件。 server_type=3 [mysqld] The next three options are mutually exclusive to SERVER_PORT below. 下面的三个选项对SERVER_PORT是互斥的。skip-networking enable-named-pipe 共享内存 skip-networking enable-named-pipe shared-memory shared-memory-base-name=MYSQL The Pipe the MySQL Server will use socket=MYSQL The TCP/IP Port the MySQL Server will listen on port=3306 Path to installation directory. All paths are usually resolved relative to this. basedir="C:/Program Files/MySQL/MySQL Server 8.0/" Path to the database root datadir=C:/ProgramData/MySQL/MySQL Server 8.0/Data The default character set that will be used when a new schema or table is created and no character set is defined 创建新模式或表时使用的默认字符集，并且没有定义字符集 character-set-server= The default authentication plugin to be used when connecting to the server 连接到服务器时使用的默认身份验证插件 default_authentication_plugin=caching_sha2_password The default storage engine that will be used when create new tables when 当创建新表时将使用的默认存储引擎 default-storage-engine=INNODB Set the SQL mode to strict 将SQL模式设置为strict sql-mode="STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION" General and Slow logging. 一般和缓慢的日志。 log-output=NONE general-log=0 general_log_file="DESKTOP-NF9QETB.log" slow-query-log=0 slow_query_log_file="DESKTOP-NF9QETB-slow.log" long_query_time=10 Binary Logging. 二进制日志。 log-bin Error Logging. 错误日志记录。 log-error="DESKTOP-NF9QETB.err" Server Id. server-id=1 Indicates how table and database names are stored on disk and used in MySQL. 指示表名和数据库名如何存储在磁盘上并在MySQL中使用。 Value = 0: Table and database names are stored on disk using the lettercase specified in the CREATE TABLE or CREATE DATABASE statement. Name comparisons are case sensitive. You should not set this variable to 0 if you are running MySQL on a system that has case-insensitive file names (such as Windows or macOS). Value = 0:表名和数据库名使用CREATE Table或CREATE database语句中指定的lettercase存储在磁盘上。名称比较区分大小写。如果您在一个具有不区分大小写文件名(如Windows或macOS)的系统上运行MySQL，则不应将该变量设置为0。 Value = 1: Table names are stored in lowercase on disk and name comparisons are not case-sensitive. MySQL converts all table names to lowercase on storage and lookup. This behavior also applies to database names and table aliases. 表名以小写存储在磁盘上，并且名称比较不区分大小写。MySQL在存储和查找时将所有表名转换为小写。此行为也适用于数据库名称和表别名。 Value = 3, Table and database names are stored on disk using the lettercase specified in the CREATE TABLE or CREATE DATABASE statement, but MySQL converts them to lowercase on lookup. Name comparisons are not case sensitive. This works only on file systems that are not case-sensitive! InnoDB table names and view names are stored in lowercase, as for Value = 1.表名和数据库名使用CREATE Table或CREATE database语句中指定的lettercase存储在磁盘上，但是MySQL在查找时将它们转换为小写。名称比较不区分大小写。这只适用于不区分大小写的文件系统!InnoDB表名和视图名以小写存储，Value = 1。 NOTE: lower_case_table_names can only be configured when initializing the server. Changing the lower_case_table_names setting after the server is initialized is prohibited. lower_case_table_names=1 Secure File Priv. 权限安全文件 secure-file-priv="C:/ProgramData/MySQL/MySQL Server 8.0/Uploads" The maximum amount of concurrent sessions the MySQL server will allow. One of these connections will be reserved for a user with SUPER privileges to allow the administrator to login even if the connection limit has been reached. MySQL服务器允许的最大并发会话量。这些连接中的一个将保留给具有超级特权的用户，以便允许管理员登录，即使已经达到连接限制。 max_connections=151 The number of open tables for all threads. Increasing this value increases the number of file descriptors that mysqld requires. Therefore you have to make sure to set the amount of open files allowed to at least 4096 in the variable "open-files-limit" in 为所有线程打开的表的数量。增加这个值会增加mysqld需要的文件描述符的数量。因此，您必须确保在[mysqld_safe]节中的变量“open-files-limit”中将允许打开的文件数量至少设置为4096 section [mysqld_safe] table_open_cache=2000 Maximum size for internal (in-memory) temporary tables. If a table grows larger than this value, it is automatically converted to disk based table This limitation is for a single table. There can be many of them. 内部(内存)临时表的最大大小。如果一个表比这个值大，那么它将自动转换为基于磁盘的表。可以有很多。 tmp_table_size=94M How many threads we should keep in a cache for reuse. When a client disconnects, the client's threads are put in the cache if there aren't more than thread_cache_size threads from before. This greatly reduces the amount of thread creations needed if you have a lot of new connections. (Normally this doesn't give a notable performance improvement if you have a good thread implementation.) 我们应该在缓存中保留多少线程以供重用。当客户机断开连接时，如果之前的线程数不超过thread_cache_size，则将客户机的线程放入缓存。如果您有很多新连接，这将大大减少所需的线程创建量(通常，如果您有一个良好的线程实现，这不会带来显著的性能改进)。 thread_cache_size=10 MyISAM Specific options The maximum size of the temporary file MySQL is allowed to use while recreating the index (during REPAIR, ALTER TABLE or LOAD DATA INFILE. If the file-size would be bigger than this, the index will be created through the key cache (which is slower). MySQL允许在重新创建索引时(在修复、修改表或加载数据时)使用临时文件的最大大小。如果文件大小大于这个值，那么索引将通过键缓存创建(这比较慢)。 myisam_max_sort_file_size=100G If the temporary file used for fast index creation would be bigger than using the key cache by the amount specified here, then prefer the key cache method. This is mainly used to force long character keys in large tables to use the slower key cache method to create the index. myisam_sort_buffer_size=179M Size of the Key Buffer, used to cache index blocks for MyISAM tables. Do not set it larger than 30% of your available memory, as some memory is also required by the OS to cache rows. Even if you're not using MyISAM tables, you should still set it to 8-64M as it will also be used for internal temporary disk tables. 如果用于快速创建索引的临时文件比这里指定的使用键缓存的文件大，则首选键缓存方法。这主要用于强制大型表中的长字符键使用较慢的键缓存方法来创建索引。 key_buffer_size=8M Size of the buffer used for doing full table scans of MyISAM tables. Allocated per thread, if a full scan is needed. 用于对MyISAM表执行全表扫描的缓冲区的大小。如果需要完整的扫描，则为每个线程分配。 read_buffer_size=256K read_rnd_buffer_size=512K INNODB Specific options INNODB特定选项 innodb_data_home_dir= Use this option if you have a MySQL server with InnoDB support enabled but you do not plan to use it. This will save memory and disk space and speed up some things. 如果您启用了一个支持InnoDB的MySQL服务器，但是您不打算使用它，那么可以使用这个选项。这将节省内存和磁盘空间，并加快一些事情。skip-innodb skip-innodb If set to 1, InnoDB will flush (fsync) the transaction logs to the disk at each commit, which offers full ACID behavior. If you are willing to compromise this safety, and you are running small transactions, you may set this to 0 or 2 to reduce disk I/O to the logs. Value 0 means that the log is only written to the log file and the log file flushed to disk approximately once per second. Value 2 means the log is written to the log file at each commit, but the log file is only flushed to disk approximately once per second. 如果设置为1,InnoDB将在每次提交时将事务日志刷新(fsync)到磁盘，这将提供完整的ACID行为。如果您愿意牺牲这种安全性，并且正在运行小型事务，您可以将其设置为0或2，以将磁盘I/O减少到日志。值0表示日志仅写入日志文件，日志文件大约每秒刷新一次磁盘。值2表示日志在每次提交时写入日志文件，但是日志文件大约每秒只刷新一次磁盘。 innodb_flush_log_at_trx_commit=1 The size of the buffer InnoDB uses for buffering log data. As soon as it is full, InnoDB will have to flush it to disk. As it is flushed once per second anyway, it does not make sense to have it very large (even with long transactions).InnoDB用于缓冲日志数据的缓冲区大小。一旦它满了，InnoDB就必须将它刷新到磁盘。由于它无论如何每秒刷新一次，所以将它设置为非常大的值是没有意义的(即使是长事务)。 innodb_log_buffer_size=5M InnoDB, unlike MyISAM, uses a buffer pool to cache both indexes and row data. The bigger you set this the less disk I/O is needed to access data in tables. On a dedicated database server you may set this parameter up to 80% of the machine physical memory size. Do not set it too large, though, because competition of the physical memory may cause paging in the operating system. Note that on 32bit systems you might be limited to 2-3.5G of user level memory per process, so do not set it too high. 与MyISAM不同，InnoDB使用缓冲池来缓存索引和行数据。设置的值越大，访问表中的数据所需的磁盘I/O就越少。在专用数据库服务器上，可以将该参数设置为机器物理内存大小的80%。但是，不要将它设置得太大，因为物理内存的竞争可能会导致操作系统中的分页。注意，在32位系统上，每个进程的用户级内存可能被限制在2-3.5G，所以不要设置得太高。 innodb_buffer_pool_size=20M Size of each log file in a log group. You should set the combined size of log files to about 25%-100% of your buffer pool size to avoid unneeded buffer pool flush activity on log file overwrite. However, note that a larger logfile size will increase the time needed for the recovery process. 日志组中每个日志文件的大小。您应该将日志文件的合并大小设置为缓冲池大小的25%-100%，以避免在覆盖日志文件时出现不必要的缓冲池刷新活动。但是，请注意，较大的日志文件大小将增加恢复过程所需的时间。 innodb_log_file_size=48M Number of threads allowed inside the InnoDB kernel. The optimal value depends highly on the application, hardware as well as the OS scheduler properties. A too high value may lead to thread thrashing. InnoDB内核中允许的线程数。最优值在很大程度上取决于应用程序、硬件以及OS调度程序属性。过高的值可能导致线程抖动。 innodb_thread_concurrency=9 The increment size (in MB) for extending the size of an auto-extend InnoDB system tablespace file when it becomes full. 增量大小(以MB为单位)，用于在表空间满时扩展自动扩展的InnoDB系统表空间文件的大小。 innodb_autoextend_increment=128 The number of regions that the InnoDB buffer pool is divided into. For systems with buffer pools in the multi-gigabyte range, dividing the buffer pool into separate instances can improve concurrency, by reducing contention as different threads read and write to cached pages. InnoDB缓冲池划分的区域数。对于具有多gb缓冲池的系统，将缓冲池划分为单独的实例可以提高并发性，因为不同的线程对缓存页面的读写会减少争用。 innodb_buffer_pool_instances=8 Determines the number of threads that can enter InnoDB concurrently. 确定可以同时进入InnoDB的线程数 innodb_concurrency_tickets=5000 Specifies how long in milliseconds (ms) a block inserted into the old sublist must stay there after its first access before it can be moved to the new sublist. 指定插入到旧子列表中的块必须在第一次访问之后停留多长时间(毫秒)，然后才能移动到新子列表。 innodb_old_blocks_time=1000 It specifies the maximum number of .ibd files that MySQL can keep open at one time. The minimum value is 10. 它指定MySQL一次可以打开的.ibd文件的最大数量。最小值是10。 innodb_open_files=300 When this variable is enabled, InnoDB updates statistics during metadata statements. 当启用此变量时，InnoDB会在元数据语句期间更新统计信息。 innodb_stats_on_metadata=0 When innodb_file_per_table is enabled (the default in 5.6.6 and higher), InnoDB stores the data and indexes for each newly created table in a separate .ibd file, rather than in the system tablespace. 当启用innodb_file_per_table(5.6.6或更高版本的默认值)时，InnoDB将每个新创建的表的数据和索引存储在单独的.ibd文件中，而不是系统表空间中。 innodb_file_per_table=1 Use the following list of values: 0 for crc32, 1 for strict_crc32, 2 for innodb, 3 for strict_innodb, 4 for none, 5 for strict_none. 使用以下值列表:0表示crc32, 1表示strict_crc32, 2表示innodb, 3表示strict_innodb, 4表示none, 5表示strict_none。 innodb_checksum_algorithm=0 The number of outstanding connection requests MySQL can have. This option is useful when the main MySQL thread gets many connection requests in a very short time. It then takes some time (although very little) for the main thread to check the connection and start a new thread. The back_log value indicates how many requests can be stacked during this short time before MySQL momentarily stops answering new requests. You need to increase this only if you expect a large number of connections in a short period of time. MySQL可以有多少未完成连接请求。当MySQL主线程在很短的时间内收到许多连接请求时，这个选项非常有用。然后，主线程需要一些时间(尽管很少)来检查连接并启动一个新线程。back_log值表示在MySQL暂时停止响应新请求之前的短时间内可以堆多少个请求。只有当您预期在短时间内会有大量连接时，才需要增加这个值。 back_log=80 If this is set to a nonzero value, all tables are closed every flush_time seconds to free up resources and synchronize unflushed data to disk. This option is best used only on systems with minimal resources. 如果将该值设置为非零值，则每隔flush_time秒关闭所有表，以释放资源并将未刷新的数据同步到磁盘。这个选项最好只在资源最少的系统上使用。 flush_time=0 The minimum size of the buffer that is used for plain index scans, range index scans, and joins that do not use 用于普通索引扫描、范围索引扫描和不使用索引执行全表扫描的连接的缓冲区的最小大小。 indexes and thus perform full table scans. join_buffer_size=200M The maximum size of one packet or any generated or intermediate string, or any parameter sent by the mysql_stmt_send_long_data() C API function. 由mysql_stmt_send_long_data() C API函数发送的一个包或任何生成的或中间字符串或任何参数的最大大小 max_allowed_packet=500M If more than this many successive connection requests from a host are interrupted without a successful connection, the server blocks that host from performing further connections. 如果在没有成功连接的情况下中断了来自主机的多个连续连接请求，则服务器将阻止主机执行进一步的连接。 max_connect_errors=100 Changes the number of file descriptors available to mysqld. You should try increasing the value of this option if mysqld gives you the error "Too many open files". 更改mysqld可用的文件描述符的数量。如果mysqld给您的错误是“打开的文件太多”，您应该尝试增加这个选项的值。 open_files_limit=4161 If you see many sort_merge_passes per second in SHOW GLOBAL STATUS output, you can consider increasing the sort_buffer_size value to speed up ORDER BY or GROUP BY operations that cannot be improved with query optimization or improved indexing. 如果在SHOW GLOBAL STATUS输出中每秒看到许多sort_merge_passes，可以考虑增加sort_buffer_size值，以加快ORDER BY或GROUP BY操作的速度，这些操作无法通过查询优化或改进索引来改进。 sort_buffer_size=1M The number of table definitions (from .frm files) that can be stored in the definition cache. If you use a large number of tables, you can create a large table definition cache to speed up opening of tables. The table definition cache takes less space and does not use file descriptors, unlike the normal table cache. The minimum and default values are both 400. 可以存储在定义缓存中的表定义的数量(来自.frm文件)。如果使用大量表，可以创建一个大型表定义缓存来加速表的打开。与普通的表缓存不同，表定义缓存占用更少的空间，并且不使用文件描述符。最小值和默认值都是400。 table_definition_cache=1400 Specify the maximum size of a row-based binary log event, in bytes. Rows are grouped into events smaller than this size if possible. The value should be a multiple of 256. 指定基于行的二进制日志事件的最大大小，单位为字节。如果可能，将行分组为小于此大小的事件。这个值应该是256的倍数。 binlog_row_event_max_size=8K If the value of this variable is greater than 0, a replication slave synchronizes its master.info file to disk. (using fdatasync()) after every sync_master_info events. 如果该变量的值大于0，则复制奴隶将其主.info文件同步到磁盘。(在每个sync_master_info事件之后使用fdatasync())。 sync_master_info=10000 If the value of this variable is greater than 0, the MySQL server synchronizes its relay log to disk. (using fdatasync()) after every sync_relay_log writes to the relay log. 如果这个变量的值大于0,MySQL服务器将其中继日志同步到磁盘。(在每个sync_relay_log写入到中继日志之后使用fdatasync())。 sync_relay_log=10000 If the value of this variable is greater than 0, a replication slave synchronizes its relay-log.info file to disk. (using fdatasync()) after every sync_relay_log_info transactions. 如果该变量的值大于0，则复制奴隶将其中继日志.info文件同步到磁盘。(在每个sync_relay_log_info事务之后使用fdatasync())。 sync_relay_log_info=10000 Load mysql plugins at start."plugin_x ; plugin_y". 开始时加载mysql插件。“plugin_x;plugin_y” plugin_load The TCP/IP Port the MySQL Server X Protocol will listen on. MySQL服务器X协议将监听TCP/IP端口。 loose_mysqlx_port=33060 本篇文章为转载内容。原文链接：https://blog.csdn.net/mywpython/article/details/89499852。该文由互联网用户投稿提供，文中观点代表作者本人意见，并不代表本站的立场。作为信息平台，本站仅提供文章转载服务，并不拥有其所有权，也不对文章内容的真实性、准确性和合法性承担责任。如发现本文存在侵权、违法、违规或事实不符的情况，请及时联系我们，我们将第一时间进行核实并删除相应内容。

2023-10-08 09:56:02

129

转载

MySQL

使用Apache Sqoop从HDFS向MySQL数据导出：配置、映射器与分区键实践

...FS里的数据“搬”到MySQL数据库里去。为什么要将HDFS数据导出到MySQL？ Hadoop Distributed File System (HDFS) 是一种分布式文件系统，可以存储大量数据并提供高可用性和容错性。不过呢，HDFS这家伙可不懂SQL查询这门子事儿，所以啊，如果我们想对数据进行更深度的分析和复杂的查询操作，就得先把数据从HDFS里导出来，然后存到像是MySQL这样的SQL数据库中才行。步骤一：设置环境首先，我们需要确保已经安装了所有必要的工具和软件。以下是您可能需要的一些组件： - Apache Sqoop：这是一个用于在Hadoop和关系型数据库之间进行数据迁移的工具。 - MySQL：这是一个流行的开源关系型数据库管理系统。 - Java Development Kit (JDK)：这是开发Java应用程序所必需的一组工具。在Windows上，你可以在这里找到Java JDK的下载链接：https://www.oracle.com/java/technologies/javase-downloads.html 。在MacOS上，你可以在这里找到Java JDK的下载链接：https://jdk.java.net/15/ 步骤二：配置Hadoop和MySQL 在开始之前，请确保您的Hadoop和MySQL已经正确配置并运行。对于Hadoop，您可以查看以下教程：https://hadoop.apache.org/docs/r2.7.3/hadoop-project-dist/hadoop-common/SingleCluster.html 对于MySQL，您可以参考官方文档：https://dev.mysql.com/doc/refman/8.0/en/installing-binary-packages.html 步骤三：创建MySQL表在开始导出数据之前，我们需要在MySQL中创建一个表来存储数据。以下是一个简单的例子： CREATE TABLE students ( id int(11) NOT NULL AUTO_INCREMENT, name varchar(45) DEFAULT NULL, age int(11) DEFAULT NULL, PRIMARY KEY (id) ) ENGINE=InnoDB DEFAULT CHARSET=utf8; 这个表将包含学生的ID、姓名和年龄字段。步骤四：编写Sqoop脚本现在我们可以使用Sqoop将HDFS中的数据导入到MySQL表中。以下是一个基本的Sqoop脚本示例： bash -sqoop --connect jdbc:mysql://localhost:3306/test \ -m 1 \ --num-mappers 1 \ --target-dir /user/hadoop/students \ --delete-target-dir \ --split-by id \ --as-textfile \ --fields-terminated-by '|' \ --null-string 'NULL' \ --null-non-string '\\N' \ --check-column id \ --check-nulls \ --query "SELECT id, name, age FROM students WHERE age > 18" 这个脚本做了以下几件事： - 使用--connect选项连接到MySQL服务器和测试数据库。 - 使用-m和--num-mappers选项设置映射器的数量。在这个例子中，我们只有一个映射器。 - 使用--target-dir选项指定输出目录。在这个例子中，我们将数据导出到/user/hadoop/students目录下。 - 使用--delete-target-dir选项删除目标目录中的所有内容，以防数据冲突。 - 使用--split-by选项指定根据哪个字段进行拆分。在这个例子中，我们将数据按学生ID进行拆分。 - 使用--as-textfile选项指定数据格式为文本文件。 - 使用--fields-terminated-by选项指定字段分隔符。在这个例子中，我们将字段分隔符设置为竖线（|）。 - 使用--null-string和--null-non-string选项指定空值的表示方式。在这个例子中，我们将NULL字符串设置为空格，将非字符串空值设置为\\N。 - 使用--check-column和--check-nulls选项指定检查哪个字段和是否有空值。在这个例子中，我们将检查学生ID是否为空，并且如果有，将记录为NULL。 - 使用--query选项指定要从中读取数据的SQL查询语句。在这个例子中，我们只选择年龄大于18的学生。请注意，这只是一个基本的示例。实际的脚本可能会有所不同，具体取决于您的数据和需求。步骤五：运行Sqoop脚本最后，我们可以使用以下命令运行Sqoop脚本： bash -sqoop \ -Dmapreduce.job.user.classpath.first=true \ --libjars $SQOOP_HOME/lib/mysql-connector-java-8.0.24.jar \ --connect jdbc:mysql://localhost:3306/test \ -m 1 \ --num-mappers 1 \ --target-dir /user/hadoop/students \ --delete-target-dir \ --split-by id \ --as-textfile \ --fields-terminated-by '|' \ --null-string 'NULL' \ --null-non-string '\\N' \ --check-column id \ --check-nulls \ --query "SELECT id, name, age FROM students WHERE age > 18" 注意，我们添加了一个-Dmapreduce.job.user.classpath.first=true参数，这样就可以保证我们的自定义JAR包在任务的classpath列表中处于最前面的位置。如果一切正常，我们应该可以看到一条成功的消息，并且可以在MySQL中看到导出的数据。总结本文介绍了如何使用Apache Sqoop将HDFS中的数据导出到MySQL数据库。咱们先给环境捯饬得妥妥当当，然后捣鼓出一个MySQL表，再接再厉，编了个Sqoop脚本。最后，咱就让这个脚本大展身手，把数据导出溜溜的。希望这篇文章能帮助你解决这个问题！

2023-04-12 16:50:07

247

素颜如水_t

转载文章

[转载]java监听oracle aq,透过JMS监听Oracle AQ，在数据库变化时触发执行Java程序

...入队和出队消息等。 JDBC连接参数类 , Java Database Connectivity (JDBC) 连接参数类是Java程序中用于存储数据库连接信息的类。在文中提到的JmsConfig类中，包含了诸如用户名、密码、数据库URL以及目标队列名称等连接参数，这些参数被用于初始化和管理到Oracle数据库的JDBC连接，以便Java应用程序能够与Oracle AQ进行交互。 CustomDatumFactory与CustomDatum , 在Java与Oracle数据库类型转换过程中，CustomDatumFactory和CustomDatum是Oracle JDBC驱动提供的接口和类，用于将Oracle数据库的自定义数据类型（如在本例中的demo_queue_payload_type）转换为Java对象，以便在Java应用程序中处理。CustomDatumFactory负责创建和解析CustomDatum对象，而CustomDatum则表示一个可定制的数据结构，可以与数据库中的自定义数据类型相对应。

2023-12-17 14:22:22

138

转载

知识学习

实践的时候请根据实际情况谨慎操作。

随机学习一条linux命令：

chown user:group file - 改变文件的所有者和组。